Objects
There are four types of Git objects: blobs, trees, commits and tags. For each
one pygit2 has a type, and all four types inherit from the base Object
type.
Object lookup
In the previous chapter we learnt about Object IDs. With an Oid we can ask the
repository to get the associated object. To do that, the Repository class
implements a subset of the mapping interface.
- class pygit2.Repository(path: str | None = None, flags: pygit2.enums.RepositoryOpenFlag = <RepositoryOpenFlag.DEFAULT: 0>)
- get(key, default=None)
Return the Git object for the given id, or the default value if there is no object in the repository with that id. The id can be an Oid object or a hexadecimal string.
Example:
>>> from pygit2 import Repository
>>> repo = Repository('path/to/pygit2')
>>> obj = repo.get("101715bf37440d32291bde4f58c3142bcf7d8adb")
>>> obj
<_pygit2.Commit object at 0x7ff27a6b60f0>
- __getitem__(id)
Return the Git object for the given id. Raise KeyError if there is no object in the repository with that id. The id can be an Oid object or a hexadecimal string.
- __contains__(id)
Return True if there is an object in the repository with that id, False otherwise. The id can be an Oid object or a hexadecimal string.
The Object base type
The Object type is a base type; it is not possible to create instances of it directly.
It is the base type of the Blob, Tree, Commit and Tag types, so
it is possible to check whether a Python value is an Object or not:
>>> from pygit2 import Object
>>> commit = repository.revparse_single('HEAD')
>>> print(isinstance(commit, Object))
True
All Objects are immutable; they cannot be modified once they are created:
>>> commit.message = u"foobar"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: attribute 'message' of '_pygit2.Commit' objects is not writable
Derived types (blobs, trees, etc.) don’t have a constructor, which means they cannot be created with the common idiom:
>>> from pygit2 import Blob
>>> blob = Blob("data")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create '_pygit2.Blob' instances
New objects are created using a specific API that we will see later.
This is the common interface for all Git objects:
- class pygit2.Object
Base class for Git objects.
- __eq__(value, /)
Return self==value.
- __hash__()
Return hash(self).
- __ne__(value, /)
Return self!=value.
- __repr__()
Return repr(self).
- filemode
An enums.FileMode constant (or None if the object was not reached through a tree)
- id
The object id, an instance of the Oid type.
- name
Name (or None if the object was not reached through a tree)
- peel(target_type) Object
Peel the current object and return the first object of the given type.
If you pass None as the target type, then the object will be peeled until the type changes. A tag will be peeled until the referenced object is no longer a tag, and a commit will be peeled to a tree. Any other object type will raise InvalidSpecError.
- read_raw() bytes
Returns the byte string with the raw contents of the object.
- short_id
An unambiguous short (abbreviated) hex Oid string for the object.
- type
One of the enums.ObjectType.COMMIT, TREE, BLOB or TAG constants.
- type_str
One of the ‘commit’, ‘tree’, ‘blob’ or ‘tag’ strings.
Blobs
A blob is just a raw byte string. Blobs are the Git equivalent of files in a filesystem.
This is their API:
- class pygit2.Blob
Blob object.
Blobs implement the buffer interface, which means you can access their data via memoryview(blob) without creating a copy.
- data
The contents of the blob, a byte string. This is the same as Blob.read_raw().
Example, print the contents of the .gitignore file:
>>> blob = repo['d8022420bf6db02e906175f64f66676df539f2fd']
>>> print(blob.data.decode())
MANIFEST
build
dist
- diff([blob: Blob, flag: int = GIT_DIFF_NORMAL, old_as_path: str, new_as_path: str]) Patch
Directly generate a pygit2.Patch from the difference between two blobs.
Returns: Patch.
Parameters:
- blob: Blob
The Blob to diff.
- flag
A GIT_DIFF_* constant.
- old_as_path: str
Treat old blob as if it had this filename.
- new_as_path: str
Treat new blob as if it had this filename.
- diff_to_buffer(buffer: bytes = None, flag: int = GIT_DIFF_NORMAL[, old_as_path: str, buffer_as_path: str]) Patch
Directly generate a Patch from the difference between a blob and a buffer.
Returns: Patch.
Parameters:
- buffer: bytes
Raw data for new side of diff.
- flag
A GIT_DIFF_* constant.
- old_as_path: str
Treat old blob as if it had this filename.
- buffer_as_path: str
Treat buffer as if it had this filename.
- is_binary
True if binary data, False if not.
- size
Size in bytes.
Example:
>>> print(blob.size)
130
Creating blobs
There are a number of methods in the repository to create new blobs, and add them to the Git object database:
- class pygit2.Repository(path: str | None = None, flags: pygit2.enums.RepositoryOpenFlag = <RepositoryOpenFlag.DEFAULT: 0>)
- create_blob(data: bytes) Oid
Create a new blob from a bytes string. The blob is added to the Git object database. Returns the oid of the blob.
Example:
>>> id = repo.create_blob(b'foo bar')   # Creates blob from a byte string
>>> blob = repo[id]
>>> blob.data
b'foo bar'
- create_blob_fromdisk(path: str) Oid
Create a new blob from a file anywhere (no working directory check).
- create_blob_fromiobase(io.IOBase) Oid
Create a new blob from an IOBase object.
- create_blob_fromworkdir(path: str) Oid
Create a new blob from a file within the working directory. The given path must be relative to the working directory; otherwise an error is raised.
There are also some functions to calculate the id for a byte string without creating the blob object:
- pygit2.hash(data: bytes) Oid
Returns the oid of a new blob from a byte string without actually writing to the odb.
- pygit2.hashfile(path: str) Oid
Returns the oid of a new blob from a file path without actually writing to the odb.
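For intuition about what these functions compute, the sketch below reproduces Git's blob id with the standard library only. It assumes a SHA-1 repository (the Git default); git_blob_id is a hypothetical helper name, not part of pygit2. For such repositories, str(pygit2.hash(data)) should give the same hexadecimal string.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the hex SHA-1 id Git assigns to a blob with these contents."""
    # A blob is hashed as the header "blob <size>\0" followed by the raw bytes.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


empty_id = git_blob_id(b"")
hello_id = git_blob_id(b"hello world\n")
print(empty_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(hello_id)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because the id depends only on the contents, two files with identical bytes always map to the same blob, regardless of their names or locations.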
Streaming blob content
pygit2.Blob.data and pygit2.Blob.read_raw() read the full contents of the
blob into memory and return Python bytes. They also return the raw contents
of the blob, and do not apply any filters which would be applied upon checkout
to the working directory.
Raw and filtered blob data can be accessed as a Python Binary I/O stream (i.e. a file-like object):
- class pygit2.BlobIO(blob: pygit2.Blob, as_path: str | None = None, flags: pygit2.enums.BlobFilter = <BlobFilter.CHECK_FOR_BINARY: 1>, commit_id: pygit2.Oid | None = None)
Read-only wrapper for streaming blob content.
Supports reading both raw and filtered blob content. Implements io.BufferedReader.
Example:
>>> from pygit2 import BlobIO
>>> with BlobIO(blob) as f:
...     while True:
...         # Read blob data in 1KB chunks until EOF is reached
...         chunk = f.read(1024)
...         if not chunk:
...             break
By default, BlobIO will stream the raw contents of the blob, but it can also be used to stream filtered content (i.e. to read the content after applying filters which would be used when checking out the blob to the working directory).
Example:
>>> with BlobIO(blob, as_path='my_file.ext') as f:
...     # Read the filtered content which would be returned upon
...     # running 'git checkout -- my_file.ext'
...     filtered_data = f.read()
Trees
At the low level (libgit2) a tree is a sorted collection of tree entries. In pygit2 accessing an entry directly returns the object.
A tree can be iterated, and partially implements the sequence and mapping interfaces.
- class pygit2.Tree
Tree objects.
- __getitem__(name)
Tree[name]
Return the Object subclass instance for the given name. Raise KeyError if there is no tree entry with that name.
- __truediv__(name)
Tree / name
Return the Object subclass instance for the given name. Raise KeyError if there is no tree entry with that name. This allows navigating the tree similarly to pathlib, using the slash operator.
Example:
>>> entry = tree / 'path' / 'deeper' / 'some.file'
- __contains__(name)
name in Tree
Return True if there is a tree entry with the given name, False otherwise.
- __len__()
len(Tree)
Return the number of objects in the tree.
- __iter__()
for object in Tree
Return an iterator over the objects in the tree.
- diff_to_index(index: Index, flags: enums.DiffOption = enums.DiffOption.NORMAL, context_lines: int = 3, interhunk_lines: int = 0) Diff
Show the changes between the index and a given Tree.
Parameters:
- index: Index
The index to diff.
- flags
A combination of enums.DiffOption constants.
- context_lines
The number of unchanged lines that define the boundary of a hunk (and to display before and after).
- interhunk_lines
The maximum number of unchanged lines between hunk boundaries before the hunks will be merged into one.
- diff_to_tree([tree: Tree, flags: enums.DiffOption = enums.DiffOption.NORMAL, context_lines: int = 3, interhunk_lines: int = 0, swap: bool = False]) Diff
Show the changes between two trees.
Parameters:
- tree: Tree
The tree to diff. If no tree is given the empty tree will be used instead.
- flags
A combination of enums.DiffOption constants.
- context_lines
The number of unchanged lines that define the boundary of a hunk (and to display before and after).
- interhunk_lines
The maximum number of unchanged lines between hunk boundaries before the hunks will be merged into one.
- swap
Instead of diffing a to b, diff b to a.
- diff_to_workdir(flags: enums.DiffOption = enums.DiffOption.NORMAL, context_lines: int = 3, interhunk_lines: int = 0) Diff
Show the changes between the Tree and the workdir.
Parameters:
- flags
A combination of enums.DiffOption constants.
- context_lines
The number of unchanged lines that define the boundary of a hunk (and to display before and after).
- interhunk_lines
The maximum number of unchanged lines between hunk boundaries before the hunks will be merged into one.
Example:
>>> tree = commit.tree
>>> len(tree) # Number of entries
6
>>> for obj in tree: # Iteration
... print(obj.id, obj.type_str, obj.name)
...
7151ca7cd3e59f3eab19c485cfbf3cb30928d7fa blob .gitignore
c36f4cf1e38ec1bb9d9ad146ed572b89ecfc9f18 blob COPYING
32b30b90b062f66957d6790c3c155c289c34424e blob README.md
c87dae4094b3a6d10e08bc6c5ef1f55a7e448659 blob pygit2.c
85a67270a49ef16cdd3d328f06a3e4b459f09b27 blob setup.py
3d8985bbec338eb4d47c5b01b863ee89d044bd53 tree test
>>> obj = tree / 'pygit2.c' # Get an object by name
>>> obj
<_pygit2.Blob at 0x7f08a70acc10>
Creating trees
- class pygit2.Repository(path: str | None = None, flags: pygit2.enums.RepositoryOpenFlag = <RepositoryOpenFlag.DEFAULT: 0>)
- TreeBuilder([tree]) TreeBuilder
Create a TreeBuilder object for this repository.
- class pygit2.TreeBuilder
TreeBuilder objects.
- clear()
Clear all the entries in the builder.
- insert(name: str, oid: Oid, attr: FileMode)
Insert or replace an entry in the treebuilder.
Parameters:
- attr
Available values are FileMode.BLOB, FileMode.BLOB_EXECUTABLE, FileMode.TREE, FileMode.LINK and FileMode.COMMIT.
- remove(name: str)
Remove an entry from the builder.
- write() Oid
Write the tree to the given repository.
Commits
A commit is a snapshot of the working directory together with metadata such as the author and the committer.
- class pygit2.Commit
Commit objects.
- author
The author of the commit.
- commit_time
Commit time.
- commit_time_offset
Commit time offset.
- committer
The committer of the commit.
- gpg_signature
A tuple with the GPG signature and the signed payload.
- message
The commit message, a text string.
- message_encoding
Message encoding.
- message_trailers
Returns commit message trailers (e.g., Bug: 1234) as a dictionary.
- parent_ids
The list of parent commits’ ids.
- parents
The list of parent commits.
- raw_message
Message (bytes).
- tree
The tree object attached to the commit.
- tree_id
The id of the tree attached to the commit.
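commit_time is a Unix timestamp and commit_time_offset is the offset from UTC in minutes, so the two can be combined into a timezone-aware datetime. The sketch below uses only the standard library; commit_datetime is a hypothetical helper, and the sample values are borrowed from the Signature example later in this chapter.

```python
from datetime import datetime, timedelta, timezone


def commit_datetime(commit_time: int, commit_time_offset: int) -> datetime:
    """Combine a Unix timestamp and a UTC offset in minutes into an aware datetime."""
    tz = timezone(timedelta(minutes=commit_time_offset))
    return datetime.fromtimestamp(commit_time, tz)


when = commit_datetime(1322174594, 60)
iso = when.isoformat()
print(iso)  # 2011-11-24T23:43:14+01:00
```

With a real commit the call would be commit_datetime(commit.commit_time, commit.commit_time_offset).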
Signatures
The author and committer attributes of commit objects are Signature objects:
>>> commit.author
pygit2.Signature('Foo Ibáñez', 'foo@example.com', 1322174594, 60, 'utf-8')
Signatures can be compared for (in)equality.
Creating commits
- class pygit2.Repository(path: str | None = None, flags: pygit2.enums.RepositoryOpenFlag = <RepositoryOpenFlag.DEFAULT: 0>)
- create_commit(reference_name: str, author: Signature, committer: Signature, message: bytes | str, tree: Oid, parents: list[Oid][, encoding: str]) Oid
Create a new commit object, return its oid.
Commits can be created by calling the create_commit method of the
repository with the following parameters:
>>> from pygit2 import Signature
>>> author = Signature('Alice Author', 'alice@authors.tld')
>>> committer = Signature('Cecil Committer', 'cecil@committers.tld')
>>> tree = repo.TreeBuilder().write()
>>> repo.create_commit(
... 'refs/heads/master', # the name of the reference to update
... author, committer, 'one line commit message\n\ndetailed commit message',
... tree, # Oid of the tree object
... [] # list of Oids of the parent commits (empty for a root commit)
... )
'#\xe4<u\xfe\xd6\x17\xa0\xe6\xa2\x8b\xb6\xdc35$\xcf-\x8b~'