Ok, here at work I have a bunch (read: 100s-1000s) of simple .txt files that have several columns. The 5th column has numbers under a consistent label I will call a PTID. I need a way to grep all the data for each PTID # in the line and compile it together with the data associated with the same PTID # across multiple files. So it would go from many files that look like this:

PROTOCOL# LABID(# and txt) ASSAYID# SPECID# PTID# VISITNO# VISITDAY# DRAWDT# ASSAYRUN(txt) ANTIGEN(txt) DILUTION# READING# READRANG# NOANT(#) NOANTRAN# TITER# TESTDT# RELIABLE(txt) REPLACE MODDT(txt) COMMENTS(txt)

(each being a column) to this:

PTID# - all the data in some order for this PTID# in every file
8/16/2006 9:40:01 AM
perl + 5min work = $profit
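something like this is most of the job; rough sketch, assuming tab-delimited columns with PTID really in the 5th spot and all the files in one directory (the glob pattern and output filename are just placeholders):

#!/usr/bin/perl
use strict;
use warnings;

my %by_ptid;    # ptid => every matching line from every file

# assumption: all the .txt files sit in the current directory
for my $file (glob '*.txt') {
    open my $fh, '<', $file or die "can't open $file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my @cols = split /\t/, $line;      # assumed tab-delimited
        next if @cols < 5;
        my $ptid = $cols[4];               # 5th column is PTID
        next unless $ptid =~ /^\d+$/;      # skips the header row
        push @{ $by_ptid{$ptid} }, "$file\t$line";
    }
    close $fh;
}

# dump everything grouped by ptid ('by_ptid.txt' is a made-up name)
open my $out, '>', 'by_ptid.txt' or die "can't write output: $!";
for my $ptid (sort keys %by_ptid) {
    print {$out} "$_\n" for @{ $by_ptid{$ptid} };
}
close $out;

each output line keeps the source filename in front so you can still tell which file a row came from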
8/16/2006 9:52:18 AM
excel - macro to import files, then some formulas or another macro or pivot table to aggregate all the data
8/16/2006 9:57:31 AM
yeah i was thinking perl or excel....
8/16/2006 10:00:09 AM
to see how to do the import, open Excel and start recording (Tools > Macro > Record New Macro), then open the .txt file and import it as tab-delimited or whatever. Then look at the macro that was generated; it will probably be just a single function call for opening a file. You'll have to surround that in a loop that passes in the names of the files, then add in the actions for the columns you want (do a lot of macro recording: perform the actions, then read the generated code).
8/16/2006 10:05:57 AM
Assuming your PTID is your primary key, are the columns different in each text file? If not, how are you going to differentiate Protocol # in Text File A from Protocol # in Text File B?
8/16/2006 11:09:10 AM
yes, Perl can grep/search/regex/whatever a thousand text files in no time.
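e.g. grabbing every row for one PTID across the whole directory is a one-liner (the PTID value is made up, and it assumes tab-delimited columns):

perl -ne 'print "$ARGV: $_" if (split /\t/)[4] eq "12040190"' *.txt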
8/16/2006 11:18:47 AM
indeed, no, all the columns should be the same across all the files.... but not all columns will have entries.... pretty sure.... there are several dozen different 'formats' that i need to look through to verify that.... and yes, perl, if scripted correctly, can do that....
8/16/2006 2:34:10 PM
perl for sure
8/16/2006 3:21:08 PM
rather, there will be 4 'compilations' of the data, each having a different form; each form pertains to a specific set of files, so all the data for each will be uniform in organization
8/17/2006 1:03:41 PM
fuck, i had an excel IF formula set up where it would remove duplicate ptids, like this:

12040190
12040190
12040190
12040196
etc.

and turn it into:

12040190
12040196

but i fucked it up somehow and lost the IF statement.... and i had it working too....
8/31/2006 9:29:51 AM
n/m fixed it
8/31/2006 9:42:16 AM
assuming you have your list starting in A1, the formula starting in B2 and filled down is

=IF(A2=A1,"",A2)

that will replace the duplicates with blanks
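and if you'd rather not depend on excel for it, a quick perl version of the same idea (assuming the ptids are one per line in an already-sorted file; the script and file names are made up):

#!/usr/bin/perl
use strict;
use warnings;

# skip any line that repeats the previous one, same idea as the IF formula
my $last = '';
while (my $line = <>) {
    chomp $line;
    print "$line\n" unless $line eq $last;
    $last = $line;
}

run it like: perl dedup.pl ptids.txt > unique_ptids.txt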
8/31/2006 9:45:33 AM