http://blogs.conchango.com/jamiethomson/archive/2006/06/22/4116.aspx
I used to do this with a small batch file. The batch file is called concat.bat and takes the following parameters: %1 for the extension of the files to concatenate (e.g. txt) and %2 for the folder in which the files are stored.
The body of the batch file is as follows:
set I=
for /F %%f in ('dir "%2"\*.%1 /B /O:-D /T:C /A:-D') do set I=%%f
if not defined I goto EXIT
copy "%2"\*.%1 "%2"\backup
copy "%2"\*.%1 "%2"\%I:~0,5%concat.tmp
if exist "%2"\%I:~0,5%concat.tmp del "%2"\*.%1 > NUL
ren "%2"\*concat.tmp *.%1
:EXIT
Example of the usage of the batch file:
The batch file is called with %1 = txt and %2 = c:\temp. Say there are three files in c:\temp called test1.txt, test2.txt, and test3.txt with create dates 2006/01/01, 2006/01/02, and 2006/01/03 respectively. The result is one text file called test1concat.txt containing the content of all three files.
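As a sketch, the call for this example could look like the line below. It assumes the script is saved as concat.bat in the current directory and that c:\temp\backup already exists. The extension is passed without a leading dot, because the script itself writes *.%1.

```batch
rem Hypothetical invocation: %1 = txt (no leading dot), %2 = c:\temp
concat.bat txt c:\temp
```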
Explanation of each line of code:
Set I=
clears the variable I, so that it is undefined unless the loop below assigns a file name to it.
for /F %%f in ('dir "%2"\*.%1 /B /O:-D /T:C /A:-D') do set I=%%f
iterates over all files in folder c:\temp with extension ".txt", ordered descending by create date (/O:-D sorts by date, newest first, and /T:C selects the creation date). In each iteration, "do set I=%%f" assigns the file name to variable I. After the loop, I contains the last file name in the list. This is the file created first, since the list is sorted in descending order.
if not defined I goto EXIT
checks whether I has a value. If it has none, there were no .txt files in the folder and the batch file ends.
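The interplay of "set I=" and "if not defined" can be tried in isolation. This is a minimal sketch, not part of concat.bat; the value test1.txt is simply borrowed from the example.

```batch
@echo off
rem "set I=" with nothing after the equals sign removes the variable.
set I=
if not defined I echo I is not defined
rem Once a value is assigned, the variable counts as defined.
set I=test1.txt
if defined I echo I is now %I%
```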
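copy "%2"\*.%1 "%2"\backup
copies the three source files to a backup folder. This is done for auditing purposes. The backup folder must already exist; otherwise copy combines the source files into a single file named backup.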
copy "%2"\*.%1 "%2"\%I:~0,5%concat.tmp
concatenates all .txt files into one file whose name starts with the first five characters of the file name stored in I (that is what %I:~0,5% extracts). So a file called test1concat.tmp is created, containing the concatenation of the three files.
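The substring expansion can be checked on its own. The sketch below hard-codes the file name from the example instead of reading it from the folder.

```batch
@echo off
rem File name taken from the example; in concat.bat it comes from the for /F loop.
set I=test1.txt
rem %I:~0,5% takes 5 characters starting at offset 0, here "test1".
echo %I:~0,5%concat.tmp
```

For this value the echoed name is test1concat.tmp.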
if exist "%2"\%I:~0,5%concat.tmp del "%2"\*.%1 > NUL
checks that the .tmp file has been created and, if so, deletes the three source files, so the inbox is emptied.
ren "%2"\*concat.tmp *.%1
renames the .tmp file so that it has the original extension again. Because the * in the target keeps the base name, the final file in this case is test1concat.txt.
:EXIT
is the label the batch file jumps to when no files were found. The batch file ends here.
Currently it only works if all files have the same filename length, so 5 (="testX") in the case of the example. Of course this can be improved, e.g. by searching for the first dot and taking the part of the file name to the left of it.
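One possible way to drop the fixed length of 5 is the %%~nf modifier, which expands a for variable to the file name without its extension. The sketch below is an assumption about how that could look, not a tested replacement for concat.bat. It takes the same two parameters and only prints the target name it would use.

```batch
@echo off
rem Sketch only: same parameters as concat.bat (%1 = extension, %2 = folder).
set I=
rem "delims=" keeps file names that contain spaces in one piece.
rem %%~nf expands to the file name without its extension.
rem /O:-D /T:C lists newest first by creation date, so the oldest file comes last.
for /F "delims=" %%f in ('dir "%~2\*.%1" /B /O:-D /T:C /A:-D') do set I=%%~nf
if not defined I goto EXIT
echo Target file would be "%~2\%I%concat.tmp"
:EXIT
```

The remaining copy, del and ren lines would then use %I% in place of %I:~0,5%.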
I have no idea if this approach is better with regard to performance. I haven't tested it on thousands of files and have not done a comparison to the approach used in the post by Jamie. On the other hand, this approach will also work nicely in DTS, where there is no ForEach Loop.
Note:
The "for /F" functionality is not available in all Windows versions. It is available in Windows Server 2003 and Windows XP.
Hi, looking for how to concat the contents (all .txt files or .doc files) of a folder into a single .txt or .doc file, without specifying filenames. In MS-DOS!
Thanks, and keep up the good work,
J