Text Diff
Compare two text blocks line-by-line to identify additions, deletions, and unchanged content.
About Text Diff
This text diff tool compares two text blocks line by line using a set-based comparison: each line is classified as existing only in the original (A), only in the modified version (B), or in both. Output follows the line prefixes of unified diff notation: - for deletions, + for additions, and a leading space for unchanged lines.
It is useful for code review before commits, comparing configuration files, reviewing document revisions, validating API response changes, debugging environment differences, checking log file variations, reviewing legal contract changes, and any scenario requiring visual comparison of text versions.
Diff Output Format
The diff output uses unified diff notation:
Legend:
(space) = Unchanged line (exists in both A and B)
- = Line removed (exists only in A)
+ = Line added (exists only in B)
Example Output:
  line 1 unchanged
  line 2 unchanged
- removed line from original
  line 3 unchanged
+ added line in modified version
  line 4 unchanged
Diff Algorithm Explanation
This tool uses a set-based comparison approach:
Algorithm Steps:
1. Split Text A into lines → array A
2. Split Text B into lines → array B
3. Create Set from B for O(1) lookup
4. Create Set from A for O(1) lookup
5. For each line in A:
   - If in Set B → unchanged (prefix with space)
   - If not in Set B → deletion (prefix with -)
6. For each line in B:
   - If not in Set A → addition (prefix with +)
Note: This is a simplified diff. Advanced tools (Git, diff utility) use Myers' algorithm for optimal LCS matching.
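The steps above can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not the tool's actual source: the function name `setDiff` and the two-character prefixes (marker plus a separating space) are hypothetical choices.

```javascript
// Minimal sketch of the set-based steps listed above.
// `setDiff` is a hypothetical name, not the tool's real implementation.
function setDiff(textA, textB) {
  const linesA = textA.split("\n"); // step 1
  const linesB = textB.split("\n"); // step 2
  const setB = new Set(linesB);     // step 3: fast membership test
  const setA = new Set(linesA);     // step 4
  const out = [];

  // Step 5: every line of A is either unchanged or a deletion.
  for (const line of linesA) {
    out.push((setB.has(line) ? "  " : "- ") + line);
  }
  // Step 6: lines of B that never occur in A are additions.
  for (const line of linesB) {
    if (!setA.has(line)) out.push("+ " + line);
  }
  return out;
}

const setDiffResult = setDiff("a\nb\nc", "a\nc\nd");
console.log(setDiffResult.join("\n"));
// prints:
//   a
// - b
//   c
// + d
```

Followed literally, the two passes emit every addition after all lines from A, so the ordering produced by this sketch can differ from an interleaved diff in which each added line sits next to the line it replaces.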
Common Diff Formats
| Format | Notation | Use Case |
|---|---|---|
| Normal Diff | < original, > modified | Traditional Unix diff |
| Context Diff | - old, + new, ! changed | Shows surrounding context |
| Unified Diff | - deleted, + added | Git, GitHub, code review |
| Side-by-Side | Two columns | Visual comparison tools |
| Color Diff | Red/Green highlighting | Terminal output, PRs |
Diff Example: Code Revision
Original (A):
function calculateTotal(items) {
  let total = 0;
  for (let i = 0; i < items.length; i++) {
    total += items[i].price;
  }
  return total;
}
Modified (B):
function calculateTotal(items) {
  let total = 0;
  for (let i = 0; i < items.length; i++) {
    total += items[i].price * items[i].quantity;
  }
  return total;
}
Diff Output:
  function calculateTotal(items) {
    let total = 0;
    for (let i = 0; i < items.length; i++) {
-     total += items[i].price;
+     total += items[i].price * items[i].quantity;
    }
    return total;
  }
Diff Example: Configuration Change
Original config.env:
DEBUG=false
PORT=3000
DATABASE_URL=postgres://localhost/mydb
LOG_LEVEL=info
Modified config.env:
DEBUG=true
PORT=3000
DATABASE_URL=postgres://localhost/mydb
LOG_LEVEL=debug
CACHE_TTL=3600
Diff Output:
- DEBUG=false
+ DEBUG=true
  PORT=3000
  DATABASE_URL=postgres://localhost/mydb
- LOG_LEVEL=info
+ LOG_LEVEL=debug
+ CACHE_TTL=3600
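For KEY=VALUE files like the one above, a key-aware comparison can report a changed value as a single change instead of a delete plus an add. The sketch below is one possible approach, not something this tool does; `parseEnv` and `diffEnv` are hypothetical helper names, and the parsing rules (skip blank lines, `#` comments, and lines without `=`) are assumptions.

```javascript
// Hypothetical helper: parse KEY=VALUE lines into a Map.
// Blank lines, # comments, and lines without "=" are skipped in this sketch.
function parseEnv(text) {
  const entries = new Map();
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    entries.set(line.slice(0, eq), line.slice(eq + 1));
  }
  return entries;
}

// Hypothetical helper: compare two configs by key rather than by whole line.
function diffEnv(textA, textB) {
  const a = parseEnv(textA);
  const b = parseEnv(textB);
  const changes = [];
  for (const [key, value] of a) {
    if (!b.has(key)) {
      changes.push({ key, type: "removed", from: value });
    } else if (b.get(key) !== value) {
      changes.push({ key, type: "changed", from: value, to: b.get(key) });
    }
  }
  for (const [key, value] of b) {
    if (!a.has(key)) changes.push({ key, type: "added", to: value });
  }
  return changes;
}

const originalEnv =
  "DEBUG=false\nPORT=3000\nDATABASE_URL=postgres://localhost/mydb\nLOG_LEVEL=info";
const modifiedEnv =
  "DEBUG=true\nPORT=3000\nDATABASE_URL=postgres://localhost/mydb\nLOG_LEVEL=debug\nCACHE_TTL=3600";

const envChanges = diffEnv(originalEnv, modifiedEnv);
console.log(envChanges);
// Reports DEBUG and LOG_LEVEL as changed and CACHE_TTL as added.
```

Because keys are looked up by name, this comparison is unaffected by the order in which settings appear in either file.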
Common Use Cases
| Domain | Use Case | Example |
|---|---|---|
| Software Development | Code review before commit | Review changes in PR |
| DevOps | Config file comparison | Prod vs staging configs |
| Legal | Contract revision tracking | Compare draft versions |
| Writing | Document editing | Track content changes |
| Data Engineering | Schema comparison | DDL diff between environments |
| Security | Policy change audit | Firewall rule changes |
Git Diff Commands Reference
Related Git commands for diff operations:
# Working directory vs staged
git diff
# Staged vs last commit
git diff --staged
# Between two commits
git diff HEAD~1 HEAD
# Between two branches
git diff main feature-branch
# Specific file diff
git diff -- path/to/file.txt
# Ignore whitespace
git diff -w
# Word-by-word diff (better for docs)
git diff --word-diff
Limitations of Simple Diff
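- No move detection: Moves are never reported as moves. A line that only changes position still exists in both texts, so the set-based comparison shows it as unchanged (an order-aware tool such as Git would show delete + add)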
- No context awareness: Doesn't understand code structure or semantics
- Exact match only: "hello " vs "hello" (trailing space) counts as different
- No character-level diff: Shows entire line changed, not which characters
- Order insensitive: Matching ignores line position and repetition count, so reordered or duplicated lines show as unchanged
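To illustrate what an order-aware comparison looks like, here is a minimal sketch of a line diff based on the longest common subsequence (LCS), computed with a plain dynamic-programming table. It is not Myers' algorithm and not this tool's implementation; `lcsDiff` is a hypothetical name, and the O(n×m) table is only practical for small inputs.

```javascript
// Hypothetical sketch: LCS-based line diff using dynamic programming.
// Uses O(n*m) time and memory, so it is only suitable for small inputs.
function lcsDiff(textA, textB) {
  const a = textA.split("\n");
  const b = textB.split("\n");

  // lcs[i][j] = length of the LCS of a[i..] and b[j..]
  const lcs = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both arrays, following the table to keep the longest match.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push("  " + a[i]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push("- " + a[i]);
      i++;
    } else {
      out.push("+ " + b[j]);
      j++;
    }
  }
  while (i < a.length) out.push("- " + a[i++]);
  while (j < b.length) out.push("+ " + b[j++]);
  return out;
}

const swapped = lcsDiff("x\ny", "y\nx");
console.log(swapped.join("\n"));
// prints:
// - x
//   y
// + x
```

With the same input, a set-based comparison would mark both lines as unchanged, because each line exists somewhere in the other text. The LCS walk respects order, so the swap shows up as one deletion and one addition.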
Tips for Better Diff Results
- Format code consistently before comparing (use Prettier, eslint --fix)
- Remove trailing whitespace to avoid false positives
- Use meaningful line breaks (one statement per line)
- For config files, keep keys in consistent order
- Consider using specialized diff tools for JSON, XML, or YAML
- For large files, focus on specific sections or hunks
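Two of the tips above, removing trailing whitespace and keeping config keys in a consistent order, can be applied with a small preprocessing step before pasting text into the tool. The sketch below is an assumption-laden illustration: `normalizeForDiff` and `sortLines` are hypothetical helpers, and the line-ending normalization is an extra step not mentioned in the tips.

```javascript
// Hypothetical helper: clean up text before comparing it.
function normalizeForDiff(text) {
  return text
    .replace(/\r\n?/g, "\n") // assumption: also unify CRLF/CR line endings
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, "")) // strip trailing spaces and tabs
    .join("\n");
}

// Hypothetical helper for KEY=VALUE configs: drop empty lines and sort the rest
// so both files list their keys in the same order.
function sortLines(text) {
  return text
    .split("\n")
    .filter((line) => line !== "")
    .sort()
    .join("\n");
}

const cleaned = normalizeForDiff("hello \r\nworld\t");
const sorted = sortLines("PORT=3000\nDEBUG=true\n");
console.log(JSON.stringify(cleaned)); // "hello\nworld"
console.log(JSON.stringify(sorted));  // "DEBUG=true\nPORT=3000"
```

Sorting is only appropriate where line order carries no meaning, such as flat KEY=VALUE files; applying it to source code would destroy the structure being compared.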
Frequently Asked Questions
- What is a diff algorithm and how does it work?
- A typical diff algorithm compares two sequences to find their longest common subsequence (LCS), then reports everything outside it as added or removed. The most common is Myers' diff algorithm, used by Git. It finds matching lines, then marks non-matching lines as additions (+) or deletions (-). This enables efficient version comparison and merge operations.
- What do the diff symbols mean?
- Standard diff notation: ' ' (space) = unchanged line, '-' = line removed from original, '+' = line added in new version. Some tools use colors: red for deletions, green for additions. Context diffs show surrounding lines; unified diffs show both changes together with @@ markers.
- What is the difference between line-by-line and character-by-character diff?
- Line-by-line diff compares entire lines, marking whole lines as added/removed. Character-by-character diff shows exact character changes within lines (useful for typos). Line diff is faster and better for code/docs; character diff is more precise for prose. This tool uses line-by-line comparison.
- How is diff used in version control?
- Git uses diff to show changes between commits, branches, and working directory. Commands like 'git diff', 'git diff --staged', and pull request views all use diff algorithms. Diffs enable code review, conflict resolution, and understanding what changed between versions without reading entire files.
- What are common diff output formats?
- Common formats: Normal diff (shows < for old, > for new), Context diff (- for old, + for new, ! for changed with context lines), Unified diff (most common, shows - and + with @@ hunk headers), Side-by-side diff (two columns for visual comparison), and Color diff (terminal colors for readability).
- What are common use cases for text diff?
- Common uses include: code review before commits, comparing config files after changes, reviewing document revisions, checking log file differences, validating API response changes, comparing database dumps, reviewing legal contract revisions, and debugging why two similar files produce different results.
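As a rough illustration of the line-level versus character-level distinction discussed in the FAQ, the sketch below narrows a changed line down to the span that differs by trimming the common prefix and suffix. This is a simplification rather than a full character-by-character diff, it is not a feature of this tool, and `changedSpan` is a hypothetical name.

```javascript
// Hypothetical helper: find the differing middle of two versions of a line
// by stripping their shared prefix and shared suffix.
function changedSpan(oldLine, newLine) {
  // Length of the common prefix.
  let start = 0;
  const maxStart = Math.min(oldLine.length, newLine.length);
  while (start < maxStart && oldLine[start] === newLine[start]) start++;

  // Shrink both ends while the trailing characters match,
  // without overlapping the prefix already consumed.
  let endOld = oldLine.length;
  let endNew = newLine.length;
  while (
    endOld > start &&
    endNew > start &&
    oldLine[endOld - 1] === newLine[endNew - 1]
  ) {
    endOld--;
    endNew--;
  }

  return {
    prefix: oldLine.slice(0, start),
    removed: oldLine.slice(start, endOld),
    added: newLine.slice(start, endNew),
    suffix: oldLine.slice(endOld),
  };
}

const span = changedSpan(
  "total += items[i].price;",
  "total += items[i].price * items[i].quantity;"
);
console.log(span);
// prefix: "total += items[i].price", removed: "",
// added: " * items[i].quantity", suffix: ";"
```

This approach reports a single changed span per line, so two separate edits on the same line are merged into one region; a true character-level diff would apply an LCS-style comparison to the characters instead.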