
Git Internals and the Object Model

Understand how Git stores data internally using blobs, trees, commits, and refs.

Understanding Git's object model

At its core, Git is a content-addressable filesystem: each object is stored under a key computed by hashing its contents together with a short header. Knowing these internals helps you use Git more effectively and troubleshoot complex problems.
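
To make "content-addressable" concrete, here is a minimal sketch run in a throwaway repository. The sample strings and variable names are arbitrary choices for illustration. It hashes the same bytes twice and different bytes once.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

# Same bytes in, same object ID out; no filename is involved at all.
id_a=$(printf 'same bytes\n' | git hash-object --stdin)
id_b=$(printf 'same bytes\n' | git hash-object --stdin)
id_c=$(printf 'other bytes\n' | git hash-object --stdin)

echo "a: $id_a"
echo "b: $id_b"
echo "c: $id_c"
```

The first two IDs should match and the third should differ, which is why Git can store identical file contents only once.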

The four object types

Git stores all versioned content as one of these four object types:

  • Blob - Stores file contents (no filename or mode)
  • Tree - Stores a directory listing: names, modes, and the blobs and subtrees they point to
  • Commit - Stores a pointer to a snapshot (a tree) plus metadata: parents, author, committer, and message
  • Tag - Stores annotated tag information
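
The following sketch shows one object of each type in a scratch repository. The identity, file name, and tag name are placeholders chosen for the example, not part of the lesson.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"
git tag -a v1.0 -m "First release"        # an annotated tag creates a tag object

# Resolve one object of each kind and ask Git for its type.
blob_type=$(git cat-file -t HEAD:README.md)
tree_type=$(git cat-file -t 'HEAD^{tree}')
commit_type=$(git cat-file -t HEAD)
tag_type=$(git cat-file -t v1.0)

printf '%s\n' "$blob_type" "$tree_type" "$commit_type" "$tag_type"
```

A lightweight tag (created without -a) would not produce a tag object; it is only a ref pointing directly at a commit.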

Exploring Git objects

# View object type
git cat-file -t HEAD

# View object size in bytes
git cat-file -s HEAD

# Pretty-print object content (here, a commit)
git cat-file -p HEAD
# Output:
# tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
# parent 1234567890abcdef...
# author Name <email> timestamp timezone
# committer Name <email> timestamp timezone
#
# Commit message

How Git stores files

# Create a blob manually (echo appends a newline, so the blob holds 12 bytes)
echo "Hello, Git!" | git hash-object -w --stdin
# Returns: 8d0e41234f24b6da002d962a26c2495ea16a425f

# Read the blob
git cat-file -p 8d0e41234f24b6da002d962a26c2495ea16a425f
# Output: Hello, Git!

# View the tree of a commit (quoted so the shell leaves ^{tree} alone)
git cat-file -p 'HEAD^{tree}'
# Output:
# 100644 blob abc123... README.md
# 040000 tree def456... src

Structure of the .git directory

.git/
├── HEAD           # Current branch reference
├── config         # Repository configuration
├── description    # Repository description
├── hooks/         # Git hooks scripts
├── index          # Staging area
├── info/          # Additional info
│   └── exclude    # Local ignore patterns
├── objects/       # All Git objects
│   ├── pack/      # Packed objects
│   └── info/      # Object info
└── refs/          # Branch and tag references
    ├── heads/     # Local branches
    ├── remotes/   # Remote-tracking branches
    └── tags/      # Tags
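
As a small illustration of how objects/ is laid out, this sketch writes a blob and derives the path of the resulting loose object. It assumes a fresh scratch repository where nothing has been packed yet; the variable names are illustrative.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

# -w stores the blob as a loose object under .git/objects/.
oid=$(printf 'Hello, Git!\n' | git hash-object -w --stdin)

# Loose objects live at objects/<first 2 hex chars>/<remaining chars>.
dir=$(printf '%s' "$oid" | cut -c1-2)
file=$(printf '%s' "$oid" | cut -c3-)
path=".git/objects/$dir/$file"

ls -l "$path"
```

The file on disk is zlib-compressed, so reading it with cat shows binary data; git cat-file -p is the supported way to view it.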

SHA-1 object hashing

# Git computes an object ID as:
# SHA-1(type + " " + size_in_bytes + "\0" + content), e.g. "blob 11\0Hello, Git!"

# Manually compute a blob hash (printf is more portable than echo -n / echo -en)
printf '%s' "Hello, Git!" > test.txt
(printf 'blob %d\0' "$(wc -c < test.txt)"; cat test.txt) | sha1sum

# Verify with Git (should print the same hash)
git hash-object test.txt
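
The same header-plus-content rule applies to the other object types. As a sketch, the script below recomputes a commit's ID by hand. It assumes the repository uses the default SHA-1 object format, and sha1_hex is a hypothetical helper that picks whichever checksum tool is installed.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

# Hypothetical helper: sha1sum on Linux, shasum on macOS.
sha1_hex() {
  if command -v sha1sum >/dev/null 2>&1; then sha1sum; else shasum -a 1; fi |
    cut -d' ' -f1
}

# The header is "<type> <size>\0", followed by the raw object body.
size=$(git cat-file -s HEAD)
manual=$({ printf 'commit %s\0' "$size"; git cat-file commit HEAD; } | sha1_hex)
actual=$(git rev-parse HEAD)

echo "manual: $manual"
echo "actual: $actual"
```

Because the commit body includes the tree ID and the parent IDs, changing any file or any ancestor changes this hash as well.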

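Packfiles

# Git packs objects for efficiency
git gc

# View pack information
git verify-pack -v .git/objects/pack/pack-*.idx

# Manually pack objects
git repack -a -d

# Unpack for inspection (try this on a scratch clone). unpack-objects skips
# objects the repository already has, so move the pack out of
# .git/objects/pack first. Assumes a single pack; /tmp/inspect.pack is an
# arbitrary scratch path.
mv .git/objects/pack/pack-*.pack /tmp/inspect.pack
git unpack-objects < /tmp/inspect.pack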
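
One way to watch packing happen is to compare git count-objects -v before and after git gc. This is a sketch on a scratch repository; count_field is a hypothetical helper name, and the exact counts depend on what the repository contains.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

# Hypothetical helper: pull one field out of `git count-objects -v`.
count_field() {
  git count-objects -v | awk -v key="$1:" '$1 == key { print $2 }'
}

loose_before=$(count_field count)     # loose objects before packing
git gc --quiet
loose_after=$(count_field count)      # loose objects left after packing
packed_after=$(count_field in-pack)   # objects now stored in packfiles

echo "loose before gc: $loose_before"
echo "loose after gc:  $loose_after"
echo "packed after gc: $packed_after"
```

In this small repository every object is reachable, so the loose objects should all move into a single pack.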

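References (refs)

# View all refs
git show-ref

# View HEAD
cat .git/HEAD
# Output: ref: refs/heads/main

# View branch ref (the file may be missing if the ref is stored in
# .git/packed-refs; `git rev-parse main` works either way)
cat .git/refs/heads/main
# Output: abc123def456...

# Create a ref manually (prefer update-ref to writing the file by hand:
# it checks the value and records the change in the reflog)
git update-ref refs/heads/new-branch abc123def456...

# Symbolic refs: read where HEAD points, then point it at a branch
git symbolic-ref HEAD
git symbolic-ref HEAD refs/heads/main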
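
A short sketch of the idea that a branch is just a named pointer to a commit. It runs in a scratch repository; the branch names and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/main     # name the initial branch explicitly
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

head_target=$(git symbolic-ref HEAD)      # the branch HEAD points at
tip=$(git rev-parse HEAD)                 # the commit that branch points at

# Creating a branch amounts to creating a ref that holds a commit ID.
git update-ref refs/heads/new-branch "$tip"
new_tip=$(git rev-parse refs/heads/new-branch)

echo "HEAD -> $head_target"
git show-ref --heads
```

Both branches should list the same commit ID, since no new commit was created; only a second pointer was.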

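The index (staging area)

# View index contents (mode, object ID, stage number, path)
git ls-files -s

# View index in detail (-s is short for --stage; --debug adds cached stat data)
git ls-files --stage --debug

# Update index directly (<blob-sha> stands for the ID of an existing blob)
git update-index --add --cacheinfo 100644,<blob-sha>,filename

# Read tree into index
git read-tree HEAD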
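
To illustrate that the index maps paths to blob IDs independently of the working tree, this sketch stages a file that never exists on disk. The file name notes.txt and its content are made up for the example.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

# Store a blob, then register it in the index under a path of our choosing.
oid=$(printf 'staged by hand\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo "100644,$oid,notes.txt"

# Each index line reads: <mode> <object ID> <stage><TAB><path>
entry=$(git ls-files -s)
echo "$entry"
```

Running git status at this point would report notes.txt as staged and also as deleted in the working tree, because only the index knows about it.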

Understanding commits

# View commit details
git log --format=raw -1

# Create commit manually (low-level); commit-tree prints the new commit ID
# but does not move any branch to it
tree=$(git write-tree)
echo "Commit message" | git commit-tree "$tree" -p HEAD

# View commit graph
git rev-list --all --graph
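
Putting the plumbing commands together, the following sketch builds a root commit without git commit and then attaches it to the current branch. The message, file name, and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "hello" > README.md
git add README.md

# write-tree snapshots the current index as a tree object.
tree=$(git write-tree)
# commit-tree wraps the tree in a commit; with no -p this is a root commit.
commit=$(echo "Built with plumbing" | git commit-tree "$tree")
# Nothing points at the commit yet; update-ref moves the current branch to it.
git update-ref HEAD "$commit"

git log --format='%H %s' -1
```

This is roughly the sequence git commit performs for you, minus hooks, message editing, and reflog messages.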

Practical uses

# Find large objects
git rev-list --objects --all | \
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | \
  sed -n 's/^blob //p' | \
  sort -rnk2 | \
  head -20

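# Recover a deleted commit (add --no-reflogs if the commit is still listed in
# a reflog, because fsck treats reflog entries as reachable)
git fsck --lost-found
# Look in .git/lost-found/commit/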

# Verify repository integrity
git fsck --full
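
As a worked example of recovery, this sketch deletes a branch and then finds its commit again with git fsck. It assumes a scratch repository; the branch names feature and rescued are arbitrary.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/main
echo "one" > file.txt
git add file.txt
git commit -q -m "Base commit"

# Make a commit on a side branch, then delete the branch.
git checkout -q -b feature
echo "two" >> file.txt
git commit -q -am "Work that will be lost"
lost=$(git rev-parse HEAD)
git checkout -q main
git branch -q -D feature

# The commit is still in HEAD's reflog, so tell fsck to ignore reflogs.
found=$(git fsck --no-reflogs --lost-found 2>/dev/null |
  awk '$1 == "dangling" && $2 == "commit" { print $3 }')

# Re-attach the commit to a new branch.
git branch rescued "$found"
git log --format='%h %s' -1 rescued
```

In day-to-day use git reflog is often the quicker route; fsck is the fallback once reflog entries have expired or were never recorded.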
