Monday, March 21, 2022

Converting Ami Pro .SAM files to .doc or .txt

Ami Pro was by far the best word processor of its time. That was the time of Windows 3.1, and later Windows 95. It was bought by Lotus, and instead of being developed into the word processor I wish I had now, it eventually disappeared...

Nowadays, there is no easy way to get to the content of these old .sam files. The files are just plain ASCII text (except when they have embedded bitmap images). But extracting the raw text from the files is not simple. For example, all accented characters are written in a strange format: "é" is written as "<\i>" in the file, "à" as "<\`>", etc.
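
As a rough illustration of what decoding these escapes by hand would involve, here is a minimal sed sketch. It only knows the two mappings mentioned above; the function name decode_ami_accents is made up for this example, and real .sam files contain many more escapes and formatting codes that it leaves untouched.

```shell
# Hypothetical helper: replace the two Ami Pro accent escapes quoted in
# this post with their UTF-8 characters. Reads stdin, writes stdout.
# Any other escape sequence passes through unchanged.
decode_ami_accents() {
    sed -e 's/<\\i>/é/g' \
        -e 's/<\\`>/à/g'
}

# Usage:
#   printf '%s\n' 'd<\i>j<\`> vu' | decode_ami_accents
#   # prints: déjà vu
```

This is only meant to show the shape of the problem; a complete table of escapes would be needed for real files.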

After trying various solutions like installing Windows NT 4 into a virtual machine, or directly installing Lotus Ami Pro 3.1 into an old Windows XP VM, I came across mentions of a plugin for Microsoft Word that would allow it to read .sam files. That plugin itself was hard to find. It seems to have been included in old Microsoft converter packs which are not available anymore. This blog post from 2011 explains how to install the "Ami Pro" plugin from http://www.gmayor.com/downloads.htm but unfortunately the download is not available there anymore, saying "Sadly this old filter no longer appears to work".

Eventually, I found it at http://www.lotusamipro.com/ where it can still be downloaded: http://www.lotusamipro.com/files/word2ami.zip

And it does work in MS Word 2003, which I had in an old Windows XP virtual machine.

So, if you have Word 2003,

  • Get that file from http://www.lotusamipro.com/files/word2ami.zip (or from here)
  • Copy "Ami332.cnv"
    to "C:\Program Files\Common Files\Microsoft Shared\TextConv\Ami332.cnv"
  • Open Word, and in the File / Open... window, under "Files of type:" select "Ami Pro 3.0 (*.sam)" (or "All Files (*.*)")
    You will get this warning on which you will have to click "Yes":
    This file needs to be opened by the Ami Pro 3.0 text converter, which may pose a security risk if the file you are opening is a malicious file. Choose Yes to open this file only if you are sure it is from a trusted source.

If you have many files to convert, you can map macros to buttons in Word to make it easier. Here are two macros in that ancient VBA language which Word understands, to save the current file as ".doc" and as ".txt":

Sub SaveAsDOC()
' Save current document as .doc
    strDocName = ActiveDocument.Name
    strPath = ActiveDocument.Path & "\"
    intPos = InStrRev(strDocName, ".")
    strDocName = Left(strDocName, intPos - 1)
    strDocName = strPath & strDocName & ".doc"

    ActiveDocument.SaveAs _
        FileFormat:=wdFormatDocument, _
        FileName:=strDocName, _
        AddToRecentFiles:=True
End Sub

Sub SaveAsTXT()
' Save current document as .txt

    strDocName = ActiveDocument.Name
    strPath = ActiveDocument.Path & "\"
    intPos = InStrRev(strDocName, ".")
    strDocName = Left(strDocName, intPos - 1)
    strDocName = strPath & strDocName & ".txt"

    ActiveDocument.SaveAs _
        FileFormat:=wdFormatText, _
        FileName:=strDocName, _
        AddToRecentFiles:=True, _
        Encoding:=1252, _
        LineEnding:=wdCRLF
End Sub

If you are on Mac or Linux or have WSL installed in Windows, you may also want to use Bash to convert the .txt files from their Windows CP 1252 character set to UTF-8:

for f in *.txt; do recode cp1252/..utf8/ "$f"; done # using recode

Or if you don't have recode but have iconv:

for f in *.txt; do iconv -f cp1252 -t utf8 -o "$f.tmp" "$f" && mv -f "$f.tmp" "$f"; done
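
One thing to watch with the loops above: running them a second time on the same files would convert already-converted text again and mangle the accents. A possible guard, sketched here under the assumption that a file which already decodes as valid UTF-8 does not need converting (the function name to_utf8_once is made up for this example):

```shell
# Hypothetical helper: convert one file from CP1252 to UTF-8 in place,
# but only if it is not already valid UTF-8 (pure ASCII counts as valid).
to_utf8_once() {
    local f=$1
    # iconv exits non-zero when the input is not valid UTF-8
    if iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1; then
        return 0
    fi
    # Write to a temporary file first so a failed conversion
    # does not clobber the original
    iconv -f CP1252 -t UTF-8 "$f" > "$f.tmp" && mv -f "$f.tmp" "$f"
}

# Usage:
#   for f in *.txt; do to_utf8_once "$f"; done
```

The check is a heuristic: a CP1252 file whose accented bytes happen to form valid UTF-8 sequences would be skipped, although that is unlikely in ordinary text.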

To set the modification time of the new files to the time of the originals, the touch command can be used in Bash:

for f in *.SAM; do touch -c -r "$f" "${f%%.SAM}.txt"; done  # date of .SAM file to .txt file
for f in *.SAM; do touch -c -r "$f" "${f%%.SAM}.doc"; done  # date of .SAM file to .doc file
# or for both .txt and .doc files at once:
for f in *.SAM; do touch -c -r "$f" "${f%%.SAM}.txt" "${f%%.SAM}.doc"; done
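
Because touch -c silently skips files that do not exist, the loops above will not tell you if some .SAM file was never converted. A small check along these lines could list them; this is just a sketch, the function name list_unconverted is made up, and it assumes the converted files sit next to the originals with the same base name:

```shell
# Hypothetical helper: print every .SAM/.sam file in a directory
# (default: current directory) that has no .txt counterpart.
list_unconverted() {
    local dir=${1:-.} f
    for f in "$dir"/*.SAM "$dir"/*.sam; do
        [ -e "$f" ] || continue                    # glob matched nothing
        [ -e "${f%.*}.txt" ] || printf '%s\n' "$f" # no converted file found
    done
}

# Usage:
#   list_unconverted            # check the current directory
#   list_unconverted ~/old-docs # check another directory
```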

The Word converter does not import bitmap images embedded in the Ami Pro file. These can be extracted with the following Perl script:

#!/usr/bin/env perl

## Extract bitmaps embedded in file (like in Ami Pro .SAM files)

use strict;

my $debug = 1;

my $file = shift;
die "Usage: $0 FILENAME\n" unless defined $file and -r $file;

open my $fh, '<:raw', $file or die "Cannot open $file: $!\n";
read $fh, my $all, -s $fh;
close $fh;

my $filesize = -s $file;

my $count;
while ( $all =~ /(BM.{12})/sg ) {
    my $m = $1;
    warn "# ", join(" ", unpack("(H2)*", "$m")), "\n" if $debug;
    #https://en.wikipedia.org/wiki/BMP_file_format
    my ($bm, $size, $res1, $res2, $offset) = unpack "A2 V H4 H4 V", $m;
    if ( $offset > $size or $size > $filesize ) {
        warn "# Skipping false positive at $-[0] (size $size, offset $offset, file size $filesize)\n" if $debug;
        next;
    }

    warn "Found at $-[0]:\n",
          "BM     = $bm\n",
          "size   = $size\n",
          "res1   = $res1\n",
          "res2   = $res2\n",
          "offset = $offset\n" if $debug;

    $count++;
    my $bitmap = substr($all, $-[0], $size);
    print "Saving $file-$count.bmp\n";
    open my $bmfile, '>:raw', "$file-$count.bmp" or die "Cannot write $file-$count.bmp: $!\n";
    print $bmfile $bitmap;
}

Finally, an alternative which I only found afterwards is to install Lotus SmartSuite 9.8, which can be downloaded from the WinWorld site: https://winworldpc.com/product/lotus-smartsuite/9-8

That will also let you open Ami Pro files and save them in various other formats. One advantage is that when saving to Word 97 .doc files, embedded images are preserved.


Wednesday, February 10, 2010

PDF to Word conversion notes

Had a complex PDF to convert to something editable like .doc, so I had another look at what was available.

This comparative test from 2008 was very helpful, as were some readers' comments. It concluded by recommending the koolwire.com service, which was indeed quite good, and also very convenient because it can be used through email. It produced an RTF with mostly actual tables. Visually, however, the tables in this particular case would have needed quite some re-formatting to look like the original ones.

Several readers suggested the PDF-to-Word service at pdftoword.com. This gave me the best-looking results. It converted the complex tables into columnized sections instead, but that was fine. (As an aside, it is not very clear which engine this service is using. It is related to Nitro PDF, a commercial Windows application which is promoted from the pdftoword.com page. Also, the Nitro PDF pages link to the pdftoword.com service as their free version. However, the produced Word document mentions Solid Converter PDF, another commercial Windows application, in its properties. Weird...)

I also tried the convertpdftoword.net service which others suggested. It also gave a good-looking Word document, but built it with tons of independent text boxes, which was quite inconvenient in my case. A closer look showed that this service was actually using VeryPDF's PDF2Word, which produced an RTF file (but with a .doc extension). PDF2Word turns out to actually be a re-packaging of xpdf, and is free (GPL) software. The source is available, but VeryPDF sells the Windows executable.
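
If you want to check for yourself whether a ".doc" returned by such a service is really RTF in disguise, looking at the first bytes is enough, since RTF files start with the characters {\rtf. A minimal sketch (the function name is_rtf is made up for this example):

```shell
# Hypothetical helper: succeed if the file starts with the RTF signature.
is_rtf() {
    # RTF documents begin with the five bytes {\rtf
    [ "$(head -c 5 "$1")" = '{\rtf' ]
}

# Usage:
#   if is_rtf result.doc; then echo "result.doc is actually RTF"; fi
```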

The funny thing about these tests: the only completely useless conversion happened to be the one from Adobe itself.

Conclusion: I had the best results with pdftoword.com. But it all depends on your source document and what you want to do with it.
