I'm releasing the files I'll attach here under the MIT license.
The original file-list was limited to files in the src/ directory. I removed from that list various files that people identified as belonging to or originating from third parties.
VirtualBox's source structure:
src/ - the list of bad words was built from here
include/ - corrections were also applied here
tools/ - this was intentionally ignored
debian/ - this was skipped
doc/ - this was skipped (unfortunately)
Tools
build-word-list.pl - generate word-list from files in file-list
aspell list < word-list > bad-list - generates bad-list using aspell (run on OS X 10.6.4)
vi - not included; used to delete large swaths of obviously uninteresting strings from bad-list
search-file-list.pl - used while manually pruning bad-list
tabify-bad-list.pl - used for initial conversion from bad-list to replacement-instructions
process-word-and-file-list.pl - used to apply original-replacement-instructions.txt
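The first step of this pipeline can be sketched roughly as follows. This is not build-word-list.pl itself (which is Perl and isn't reproduced here), only a minimal Python illustration of the idea. The function names, the "one path per line" layout of file-list, and the definition of a word as a run of ASCII letters are all assumptions made for the example.

```python
import re
from pathlib import Path

# Assumption: a "word" is any run of ASCII letters; the real script may differ.
WORD_RE = re.compile(r"[A-Za-z]+")

def extract_words(text):
    """Return the set of alphabetic runs found in text."""
    return set(WORD_RE.findall(text))

def build_word_list(file_list_path):
    """Collect unique words from every file named in file_list_path.

    Assumes file_list_path holds one path per line; blank lines are skipped.
    """
    words = set()
    for line in Path(file_list_path).read_text().splitlines():
        path = line.strip()
        if not path:
            continue
        # errors="replace" keeps going past bytes that are not valid UTF-8.
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        words |= extract_words(text)
    return sorted(words)
```

Writing the result one word per line gives something that can be fed to aspell list on standard input, as in the command above.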
Files
word-list - not included but available; it's 640 KB compressed (1.7 MB uncompressed)
bad-list - I don't have a canonical version of this, since I was pruning file-list as I went. It should be roughly 92% of word-list.
files-to-munge.txt - this is the final file-list. Effectively, only files in this list should be patched, and only they could have contributed misspellings.
original-replacement-instructions.txt - this is roughly the actions taken by process-word-and-file-list.pl to produce the patches + series
series - ordered list of patches to apply. The series starts with a couple of items which I believe affect the user experience, followed by an alphabetically sorted set of patches.
- patchfile - the header indicates which replacements it should contain. Not all patches are independent; a number of them depend on other files earlier in series.
connectix - The reason there's an uneven number of insertions/deletions is that "conectix" was apparently intentionally misspelled by Connectix as part of its file header. You could of course omit this change (or generally speaking any others).
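For anyone who would rather regenerate the changes than apply the patches, the replacement step might look something like the sketch below. It is only an illustration: the tab-separated "misspelling, correction" layout is an assumption suggested by the name tabify-bad-list.pl, not the documented format of original-replacement-instructions.txt, and both function names are hypothetical.

```python
import re

def parse_instructions(text):
    """Parse 'misspelling<TAB>correction' lines into a dict (assumed format)."""
    pairs = {}
    for line in text.splitlines():
        if not line.strip() or "\t" not in line:
            continue
        bad, good = line.split("\t", 1)
        pairs[bad.strip()] = good.strip()
    return pairs

def apply_replacements(text, pairs):
    """Replace each misspelling where it is bounded by non-letters.

    Matching is case-sensitive, and '_' counts as a boundary so that
    identifiers such as recieve_buffer are also corrected.
    """
    if not pairs:
        return text
    # Longest first, so a longer misspelling wins over one that is its prefix.
    ordered = sorted(pairs, key=len, reverse=True)
    alternation = "|".join(re.escape(bad) for bad in ordered)
    pattern = re.compile(r"(?<![A-Za-z])(?:" + alternation + r")(?![A-Za-z])")
    return pattern.sub(lambda m: pairs[m.group(0)], text)
```

Only files named in files-to-munge.txt would be passed through this. A case like "conectix" is handled simply by leaving that pair out of the instructions.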
The end result of this is:
991 files changed, 3834 insertions(+), 3833 deletions(-)
in 742 patches.
A straight diff -U0 of the changes (folded together) is 13959 lines and 1009876 bytes. I can provide this flat version too, but I can't imagine it would be remotely useful to anyone.
One odd thing I came across while patching was STRINGFY(), which is used in Etherboot. It's clearly a VirtualBox-specific change, but I can't find a definition for it anywhere. I won't be including it in this attachment, as I've segregated the etherboot patches.
Oh, I'm working from an svn pull (well, an hg convert thereof) from:
| date: Tue Oct 19 00:16:34 2010 +0000
| summary: wddm/3d: cromium hgsmi fixes
|