Understanding the Git object model
Git is, at its core, a content-addressable filesystem. Understanding its internals helps you use Git more effectively and troubleshoot complex problems.
The four object types
Git stores all versioned content as one of four object types:
- Blob - Stores the contents of a file (no name, no mode)
- Tree - Stores a directory listing: names, modes, and the IDs of blobs and subtrees
- Commit - Stores a snapshot (a pointer to a top-level tree) with metadata
- Tag - Stores annotated tag information
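The four types listed above can be observed directly. The following is a minimal sketch, assuming `git` is installed; the throwaway repository, the file name `file.txt`, the tag `v1`, and the `Demo` identity are all made up for illustration.

```shell
# Throwaway repository and identity (illustrative assumptions)
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"
git tag -a v1 -m "annotated tag"

# Ask Git for the type of one object of each kind
blob_type=$(git cat-file -t HEAD:file.txt)
tree_type=$(git cat-file -t 'HEAD^{tree}')
commit_type=$(git cat-file -t HEAD)
tag_type=$(git cat-file -t v1)
echo "$blob_type $tree_type $commit_type $tag_type"
```

A lightweight tag (created without `-a`) would report `commit` instead, because it is only a ref and has no tag object of its own.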
Exploring Git objects
# View object type
git cat-file -t HEAD
# View object content
git cat-file -p HEAD
# View object size
git cat-file -s HEAD
# Explore a commit
git cat-file -p HEAD
# Output:
# tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
# parent 1234567890abcdef...
# author Name <email> timestamp timezone
# committer Name <email> timestamp timezone
#
# Commit message
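To see how the header fields of a commit link it to other objects, the sketch below reads the `tree` and `parent` lines from the raw commit and compares them with the IDs that revision syntax resolves to. It assumes a throwaway repository with two commits; the file name and identity are hypothetical.

```shell
# Throwaway repository and identity (illustrative assumptions)
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo one > a.txt && git add a.txt && git commit -q -m "first"
echo two > a.txt && git add a.txt && git commit -q -m "second"

# Pull the header fields out of the raw commit object
tree_line=$(git cat-file -p HEAD | sed -n '1s/^tree //p')
parent_line=$(git cat-file -p HEAD | sed -n '2s/^parent //p')

# The same IDs, resolved through revision syntax
echo "tree:   $tree_line  $(git rev-parse 'HEAD^{tree}')"
echo "parent: $parent_line  $(git rev-parse HEAD~1)"
```

A root commit has no `parent` line, and a merge commit has one `parent` line per merged branch.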
How Git stores files
# Create a blob manually
echo "Hello, Git!" | git hash-object -w --stdin
# Returns: 8d0e41234f24b6da002d962a26c2495ea16a425f
# Read the blob
git cat-file -p 8d0e41234f24b6da002d962a26c2495ea16a425f
# Output: Hello, Git!
# View the tree of a commit (quoted so the shell leaves ^ and {} alone)
git cat-file -p 'HEAD^{tree}'
# Output:
# 100644 blob abc123... README.md
# 040000 tree def456... src
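Because a blob is identified only by its content, two files with identical content should end up pointing at the same blob. A small sketch of that property, assuming a throwaway repository with made-up file names:

```shell
# Throwaway repository and identity (illustrative assumptions)
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

printf 'same content\n' > a.txt
cp a.txt b.txt
git add a.txt b.txt
git commit -q -m "two identical files"

# The tree lists two names, but both entries carry the same blob ID
git ls-tree HEAD
id_a=$(git rev-parse HEAD:a.txt)
id_b=$(git rev-parse HEAD:b.txt)
```

The file names live in the tree object, which is why renaming a file does not create a new blob.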
Structure of the .git directory
.git/
├── HEAD            # Current branch reference
├── config          # Repository configuration
├── description     # Repository description
├── hooks/          # Git hooks scripts
├── index           # Staging area
├── info/           # Additional info
│   └── exclude     # Local ignore patterns
├── objects/        # All Git objects
│   ├── pack/       # Packed objects
│   └── info/       # Object info
└── refs/           # Branch and tag references
    ├── heads/      # Local branches
    ├── remotes/    # Remote-tracking branches
    └── tags/       # Tags
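Inside `objects/`, a loose (not yet packed) object is stored in a subdirectory named after the first two hex characters of its ID, with the remaining characters as the file name. The sketch below writes one blob into a throwaway repository and derives that path; the blob content is arbitrary.

```shell
# Throwaway repository (illustrative assumption)
repo=$(mktemp -d) && cd "$repo" && git init -q

# Write a blob and capture its ID
id=$(printf 'loose object demo\n' | git hash-object -w --stdin)

# Split the ID into the directory part and the file part
dir=$(printf '%s' "$id" | cut -c1-2)
file=$(printf '%s' "$id" | cut -c3-)
path=".git/objects/$dir/$file"
ls -l "$path"
```

The file on disk is zlib-compressed, so `cat` shows binary data; `git cat-file -p "$id"` is the readable way in.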
SHA-1 hashing of objects
# Git computes SHA-1 as:
# SHA-1("blob " + content.length + "\0" + content)
# Manually compute a blob hash
# printf is used instead of echo -n / echo -en, whose flags vary between shells
printf '%s' "Hello, Git!" > test.txt
(printf 'blob %d\0' "$(wc -c < test.txt)"; cat test.txt) | sha1sum
# Verify with Git
git hash-object test.txt
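The same header-plus-content scheme applies to the other object types, with the type name swapped in. As a hedged sketch, the following recomputes the ID of a commit by hand. It assumes the repository uses SHA-1 (the current default) and that `sha1sum` is available; the repository and identity are throwaway.

```shell
# Throwaway repository and identity (illustrative assumptions)
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

# Header is "<type> <size in bytes>\0", followed by the raw object content
size=$(git cat-file -s HEAD)
manual=$( (printf 'commit %s\0' "$size"; git cat-file commit HEAD) | sha1sum | cut -d' ' -f1 )
actual=$(git rev-parse HEAD)
echo "manual: $manual"
echo "git:    $actual"
```

Since the commit content includes the tree ID, the parent IDs, and the timestamps, changing any of them produces a different commit ID.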
Packfiles
# Git packs objects for efficiency
git gc
# View pack information
git verify-pack -v .git/objects/pack/pack-*.idx
# Manually pack objects
git repack -a -d
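# Unpack for inspection. Objects already in the repository are skipped,
# so move the pack out of .git/objects/pack before unpacking it
pack=$(ls .git/objects/pack/pack-*.pack | head -n 1)
mv "$pack" /tmp/ && git unpack-objects < "/tmp/$(basename "$pack")"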
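One way to watch packing happen is to compare `git count-objects -v` before and after `git gc`. The sketch below does this in a throwaway repository with three small commits; the file name and identity are hypothetical.

```shell
# Throwaway repository and identity (illustrative assumptions)
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

for i in 1 2 3; do
  echo "version $i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
done

# "count" is the number of loose objects, "in-pack" the number of packed ones
loose_before=$(git count-objects -v | sed -n 's/^count: //p')
git gc -q
loose_after=$(git count-objects -v | sed -n 's/^count: //p')
packed_after=$(git count-objects -v | sed -n 's/^in-pack: //p')
echo "loose before: $loose_before, loose after: $loose_after, packed: $packed_after"
```

Every object here is reachable from a branch, so all of them move into the pack; unreachable loose objects are normally kept for a grace period instead.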
References (refs)
# View all refs
git show-ref
# View HEAD
cat .git/HEAD
# Output: ref: refs/heads/main
# View branch ref (the loose file is absent once refs are packed into .git/packed-refs)
cat .git/refs/heads/main
# Output: abc123def456...
# Create ref manually (safer than writing the file under .git/refs directly)
git update-ref refs/heads/new-branch abc123def456...
# Symbolic refs
git symbolic-ref HEAD refs/heads/main
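A branch is just a ref that holds a commit ID, and HEAD is normally a symbolic ref that names a branch. The sketch below creates a branch with plumbing commands only and then points HEAD at it. It assumes a throwaway repository; the branch name `new-branch` follows the commands above and the identity is made up.

```shell
# Throwaway repository and identity (illustrative assumptions)
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

# Create a branch by writing a ref that holds the current commit ID
head_id=$(git rev-parse HEAD)
git update-ref refs/heads/new-branch "$head_id"
branch_id=$(git rev-parse refs/heads/new-branch)

# Point HEAD at the new branch without touching the working tree
git symbolic-ref HEAD refs/heads/new-branch
switched=$(git symbolic-ref --short HEAD)
echo "HEAD now names: $switched"
```

Using `git rev-parse` and `git symbolic-ref` rather than `cat` keeps the example independent of whether refs are stored as loose files or packed.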
The index (staging area)
# View index contents
git ls-files -s
# View index in detail (adds ctime, mtime, size, and flags for each entry)
git ls-files --debug
# Update index directly (format: <mode>,<object>,<path>)
git update-index --add --cacheinfo 100644,<blob-hash>,filename
# Read tree into index
git read-tree HEAD
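The index maps paths to blob IDs, and it does not require a matching file in the working directory. As a rough sketch, the following stages a blob under the hypothetical path `notes.txt` without ever creating that file, then turns the index into a tree object.

```shell
# Throwaway repository (illustrative assumption)
repo=$(mktemp -d) && cd "$repo" && git init -q

# Store a blob, then register it in the index under a path of our choosing
blob=$(printf 'staged without a working file\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo "100644,$blob,notes.txt"

# Each index entry shows: mode, blob ID, stage number, path
entry=$(git ls-files -s)
echo "$entry"

# Snapshot the index as a tree object and list it
tree=$(git write-tree)
git ls-tree "$tree"
```

The stage number is 0 for a normal entry; values 1 to 3 appear only while a merge conflict is unresolved.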
Understanding commits
# View commit details
git log --format=raw -1
# Create commit manually (low-level); commit-tree prints the new commit ID
# but does not move any branch
tree=$(git write-tree)
echo "Commit message" | git commit-tree "$tree" -p HEAD
# View commit graph
git rev-list --all --graph
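Since `git commit-tree` only writes the commit object, a ref still has to be moved for the commit to appear in history. A minimal sketch of the full sequence, assuming a throwaway repository with one existing commit; the file name, message, and identity are made up.

```shell
# Throwaway repository and identity (illustrative assumptions)
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo "first version" > file.txt
git add file.txt
git commit -q -m "first commit"

# Stage a change, snapshot the index as a tree, and wrap it in a commit
echo "second version" > file.txt
git add file.txt
tree=$(git write-tree)
new_commit=$(echo "Commit made by hand" | git commit-tree "$tree" -p HEAD)

# Move the current branch (through HEAD) to the new commit
git update-ref HEAD "$new_commit"
subject=$(git log -1 --format=%s)
echo "tip subject: $subject"
```

This is roughly what `git commit` does in one step: write the tree, write the commit, and advance the branch.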
Practical uses
# Find large objects
git rev-list --objects --all | \
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | \
sed -n 's/^blob //p' | \
sort -rnk2 | \
head -20
# Recover deleted commit (finds only commits that no ref or reflog entry points to)
git fsck --lost-found
# Look in .git/lost-found/commit/
# Verify repository integrity
git fsck --full
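The recovery step can be rehearsed safely in a throwaway repository. The sketch below creates a commit on a hypothetical branch `side`, deletes the branch (which also removes its reflog), and then locates the commit with `git fsck --lost-found`. The names `side` and `recovered` and the identity are assumptions for illustration.

```shell
# Throwaway repository and identity (illustrative assumptions)
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

# Create a commit that only the branch "side" points to, then delete the branch
tree=$(git write-tree)
lost=$(echo "work on a side branch" | git commit-tree "$tree" -p HEAD)
git update-ref refs/heads/side "$lost"
git branch -q -D side

# fsck reports the now-unreferenced commit and writes it under .git/lost-found/commit/
found=$(git fsck --lost-found 2>/dev/null | sed -n 's/^dangling commit //p')
echo "dangling commit: $found"

# Give the commit a name again so it is reachable
git branch recovered "$found"
```

If the lost commit was ever checked out, it is usually still listed by `git reflog`, which tends to be the quicker place to look.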