I use both PT4 and H2N. About 80% of my database (10 million+ hands) uses 'player names', not 'id'; I didn't realize it made a difference until I was so far along that I couldn't justify switching and losing the hand histories I'd accumulated. H2N only reads 'id', and as far as I'm aware there are no easy solutions for converting 'non-standard' hh formats into standard, widely recognized ones. So after months of looking into it off and on, I figured out how to use text-editing software to reformat the hand histories so that both PT4 and H2N will read them.
A few questions:
I have tens of millions of duplicates, and I've been importing all my hand histories into PT4 on a secondary cpu, with the idea that I'll then export them and the duplicates will be gone. Are there any alternatives to this?
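For context, the kind of alternative I have in mind is a script that drops exact duplicates before anything gets imported. A rough Python sketch, assuming hands are separated by blank lines and that duplicates are identical text once whitespace is normalized (both are assumptions on my part, and `split_hands`, `dedupe` and `read_hands` are names I made up):

```python
import hashlib
import re
from pathlib import Path
from typing import Iterable, Iterator


def split_hands(text: str) -> list[str]:
    """Split one hand history file into hands.

    Assumption: hands are separated by one or more blank lines and
    contain no blank lines themselves.
    """
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    hands = []
    for chunk in re.split(r"\n\s*\n", normalized):
        chunk = chunk.strip()
        if chunk:
            # Drop trailing whitespace so trivially different copies match.
            hands.append("\n".join(line.rstrip() for line in chunk.splitlines()))
    return hands


def dedupe(hands: Iterable[str]) -> Iterator[str]:
    """Yield each distinct hand once, keeping the first occurrence."""
    seen: set[bytes] = set()
    for hand in hands:
        # Store a short digest instead of the full text to save memory.
        digest = hashlib.blake2b(hand.encode("utf-8"), digest_size=16).digest()
        if digest not in seen:
            seen.add(digest)
            yield hand


def read_hands(folder: Path) -> Iterator[str]:
    """Stream hands from every .txt file under a folder (hypothetical layout)."""
    for path in sorted(folder.rglob("*.txt")):
        yield from split_hands(path.read_text(encoding="utf-8", errors="replace"))
```

Something like `dedupe(read_hands(Path("my_hh_folder")))` would then stream the unique hands. Memory use grows with the number of unique hands, and this would not catch the same hand saved in two different formats.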
On above-average CPUs, is there a number of hands per file that PT4 reads faster than any other? Obviously, 1 hand per file would be the slowest import option, but does the speed start to peak somewhere between 500 and 20k hands per file?
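To test this myself, I'd re-split the same set of hands into files of different sizes and time the import of each batch. A minimal sketch of the splitting part, assuming the hands are already available as strings; `chunked` and `write_chunks` are made-up helper names, and separating hands with blank lines is my assumption about what the importers accept:

```python
from pathlib import Path
from typing import Iterable, Iterator


def chunked(hands: Iterable[str], size: int) -> Iterator[list[str]]:
    """Group hands into lists of at most `size` hands each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: list[str] = []
    for hand in hands:
        batch.append(hand)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # The last file may hold fewer hands than requested.
        yield batch


def write_chunks(
    hands: Iterable[str],
    out_dir: Path,
    hands_per_file: int,
    prefix: str = "hands",
) -> list[Path]:
    """Write hands to numbered files with `hands_per_file` hands in each."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    for i, batch in enumerate(chunked(hands, hands_per_file), start=1):
        path = out_dir / f"{prefix}_{i:06d}.txt"
        # Assumption: blank lines between hands are an acceptable separator.
        path.write_text("\n\n\n".join(batch) + "\n", encoding="utf-8")
        written.append(path)
    return written
```

Running `write_chunks` on the same hands with, say, 500, 5,000 and 20,000 hands per file into separate folders would give comparable batches to time against each other.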
9 times out of 10 when exporting, the hh exactly matches the 'hands-per-file' specification, but occasionally PT4 seems to ignore it and writes files with seemingly arbitrary hand counts. Does this mean I need to re-index or update the cache?
Is PT4's import/export/processing speed affected by the number of stats (custom or otherwise) in a database?
I basically want to import/export as many hh as possible, as fast as possible, and am willing to temporarily strip PT4 of functionality if it will hasten the outcome.
I'm long past serving as a prototypical example of the sunk-cost fallacy, haha. Any suggestions or tips would be much appreciated.
Thanks.