DTACK GROUNDED #5
December 1981

DTACK GROUNDED, The Journal of Simple 68000 Systems
Issue # 5 December 1981 Copyright Digital Acoustics, Inc.

Once upon a time, at a division of a personal computer manufacturer, there was a chief software type person who discovered a wondrous new device called a 68000. Upon studying the instruction set and execution speed of this device, he decided that he absolutely HAD to have one so that he could create new and wonderful programs. He then approached the chief hardware type person at his division and said, "Provide me with one of these marvelous devices so that I can create new and wonderful programs."

The hardware type person replied, "Alas, although the 68000 is truly a marvelous device, it is excessively complex and expensive. It will never be possible to use one in an inexpensive application of the type favored by our employer." The software type person went away unhappily.

But one day the software type person read an advertisement which offered a 68000 board for about $600 which would work with the personal computers manufactured by his employer. He presented this advertisement to the hardware type who said disdainfully, "That advertisement is by some fly-by-night outfit which obviously doesn't know what it is doing. That 'product' will never be real."

The software type, being stupid and naive, placed an order for a board anyway. And the nose of the chief hardware type person is now red and raw from having a 6.5 by 15 inch printed circuit board rubbed on it. If we at DTACK GROUNDED had known about this BEFORE we shipped the board we would not have cut the pins off the I.C. sockets!

[Photograph of circuit board]
UNREAL PRODUCTS MANUFACTURED BY FLY-BY-NIGHT OUTFIT

Page 2

ABOUT THOSE MOTOROLA 68000 FLOATING POINT ROMS: From time to time over the past three years there have been reports in the press of the future availability of floating point packages in ROM from Motorola for the 68000. Since several readers have asked about this, here is the latest scoop from our sources at Motorola:

There are TWO different packages. The first is a 32 bit single precision format (6 decimal digit precision) which is not IEEE compatible but is rather optimized to run as quickly as possible with the 68000 instruction set. The multiply time is given as 44 usec, divide in 75 usec and SIN/COS in 410 usec. These speeds are for an 8MHz 68000 and are FASTER than the AMD9511A, even in the 4MHz version made by Intel! The Motorola part number is M68KFFP, and is available on floppy disk for about $500 from Motorola's Microsystems Group in Phoenix. The writeup in EDN was wrong in that there ARE licensing restrictions on the package. No decision has been made whether or not to sell the software in ROM.

Motorola also has an IEEE compatible F.P. package, both single and double precision, which is being held back from general release either on disk or in ROM. However, we are told the package DOES exist and is a part of the EXORMACS run-time PASCAL package.

We suggest that you refrain from holding your breath until Motorola releases any of this software in ROM. There is some hope- they DID finally release the 6809 floating point, both single and double precision (but wayyyyyy slower) in ROM as part number MC6839 at the bargain price of $67.50!

NOW FOR THOSE 68000 MONITOR ROUTINES (FINALLY!): Here is a BASIC program line to transfer N bytes from the Pet/CBM starting at memory location B to the 68000 starting at memory location A:

100 SYSQ: REM 07 AAAAAA BBBB NNNN (spaces not needed)
110 SYSQ: REM 07 0010F4 A800 0800 (actual code to transfer FP package)

Note that the format requires a three byte address for the 68000 since the device potentially has more than 64K RAM. Line 110 transfers 2K bytes from location $A800-$AFFF in the CBM to location $0010F4-$0018F3 in the 68000.

The following code transfers 256 bytes from location $001C00-$001CFF in the 68000 to location $8A00-$8AFF in the CBM:

120 SYSQ: REM 08 001C00 8A00 0100 (spaces not needed)
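The argument layout of commands 07 and 08 can be sketched in Python (chosen here only for illustration; nothing below runs on the Pet or the 68000). The function names `transfer_args` and `last_byte` are hypothetical. The sketch formats the text that follows REM and computes the last address touched by a transfer, which is the start address plus N minus one:

```python
def transfer_args(cmd, addr_68000, addr_host, count):
    """Format the text after REM for command 07 (host to 68000) or 08 (68000 to host)."""
    if cmd not in (0x07, 0x08):
        raise ValueError("only commands 07 and 08 use this layout")
    if not 0 <= addr_68000 <= 0xFFFFFF:
        raise ValueError("68000 address must fit in three bytes")
    if not 0 <= addr_host <= 0xFFFF or not 0 <= count <= 0xFFFF:
        raise ValueError("host address and byte count must fit in two bytes")
    # Three-byte 68000 address, two-byte host address, two-byte count.
    return f"{cmd:02X} {addr_68000:06X} {addr_host:04X} {count:04X}"


def last_byte(start, count):
    """Address of the final byte in a block of `count` bytes beginning at `start`."""
    return start + count - 1


print(transfer_args(0x07, 0x0010F4, 0xA800, 0x0800))
print(f"${last_byte(0xA800, 0x0800):04X}", f"${last_byte(0x0010F4, 0x0800):06X}")
```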

The following code causes the 68000 to jump to location $001E7A and begin executing the code it finds there:

130 SYSQ: REM 02 001E7A

The following three lines of code cause $8888 to be loaded into D4, $9999 to be loaded into D5, and then traps to vector #15, which is semi-dedicated to a routine which places the 68000 registers in the 68000 RAM and then sends them to the host processor for display. The last line calls a utility routine which receives the 66 bytes from the 68000 and formats them on the display:

140 SYSQ:REM 09 001D00 0A 383C 8888 3A3C 9999 4E4F
150 SYSQ:REM 02 001D00     (execute code)
160 SYSP+27                (format registers on display)

Command 09 requires a 3 byte address, a one byte number which specifies the number of bytes to be sent to the 68000, then the needed code.
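As a rough illustration of that layout, the Python sketch below builds the text that follows REM for command 09 from a list of 16 bit code words. The name `patch_args` is hypothetical, and the 255 byte ceiling is simply what a one byte count allows, not a limit stated for the monitor:

```python
def patch_args(addr_68000, words):
    """Format a command 09 argument string: address, byte count, then the code words."""
    if not 0 <= addr_68000 <= 0xFFFFFF:
        raise ValueError("68000 address must fit in three bytes")
    byte_count = 2 * len(words)  # each 68000 code word is two bytes
    if byte_count > 0xFF:
        raise ValueError("a one byte count cannot describe more than 255 bytes")
    code = " ".join(f"{w:04X}" for w in words)
    return f"09 {addr_68000:06X} {byte_count:02X} {code}"


# The five words used in line 140: two MOVE.W instructions and a TRAP #15.
print(patch_args(0x001D00, [0x383C, 0x8888, 0x3A3C, 0x9999, 0x4E4F]))
```

Counting the bytes by program rather than by hand avoids the easy mistake of forgetting to bump N when an instruction is added to the patch.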

Page 3

What we have just outlined are four of the eleven monitor commands which are easily callable from BASIC. These are the only four that you really need to put you in business. Command 07 moves bytes to the 68000 from the host, 08 moves bytes from the 68000 to the host, 02 causes the 68000 to go to a location and begin executing code, and 09 transfers small code patches, which is useful for debugging and general code development. The procedure of lines 140, 150 and 160 is what we use when we are not sure how a particular 68000 instruction works. For the record, the code in line 140 is:

MOVE .W #$8888, D4
MOVE .W #$9999, D5
TRAP #15     (trap to routine which sends registers to the host)

Running that code and displaying the registers will confirm that $8888 and $9999 are successfully loaded into D4 and D5. To multiply D4 times D5 and leave the 32 bit result in D5, we add MULU D4, D5 to line 140. Line 140 will then look like this (note the increase of N from $0A to $0C):

140 SYSQ:REM 09 001D00 0C 383C 8888 3A3C 9999 CAC4 4E4F

Our "hand assembler's helper" will provide the correct 68000 code for these operations, of course. So what you have is a crude but usable 68000 development tool. Later, we will be able to assemble larger groups of 68000 code and automatically transfer the code up into the 68000 for execution.
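For readers without the helper program, the three instruction encodings used in line 140 can be reproduced with a few lines of Python. This is only a sketch covering these exact addressing modes (immediate word to data register, data register to data register, and TRAP); the function names are hypothetical and it is not the helper program itself:

```python
def move_w_imm(value, dn):
    """MOVE.W #value,Dn: opcode word followed by the 16 bit immediate."""
    return [0x303C | (dn << 9), value & 0xFFFF]


def mulu(src_dn, dst_dn):
    """MULU Dsrc,Ddst: unsigned 16 x 16 multiply, 32 bit result in Ddst."""
    return [0xC0C0 | (dst_dn << 9) | src_dn]


def trap(vector):
    """TRAP #vector, with vector in the range 0 to 15."""
    return [0x4E40 | (vector & 0xF)]


program = move_w_imm(0x8888, 4) + move_w_imm(0x9999, 5) + mulu(4, 5) + trap(15)
print(" ".join(f"{w:04X}" for w in program))
print(f"byte count N = ${2 * len(program):02X}")
```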

Breakpoints are easily inserted. The following three BASIC lines will insert a breakpoint at $001D3C in the 68000 and then display the 68000 registers on the host CRT:

170 SYSQ:REM 09 001D3C 02 4E4F       (trap #15 @ 001D3C)
180 SYSQ:REM 02 001D00               (run @ $001D00, for instance)
190 SYSP+27                          (display registers)

This is the procedure we used to debug our 68000 floating point package.

OTHER 68000 MONITOR ROUTINES (brief description):

Command 00: Set ONE read/write address in the 68000 and a DIFFERENT address in the host
Command 01: Set the byte count N to be transferred BOTH in the host AND in the 68000
Command 03: Send 256 bytes to the 68000 from the host
Command 04: Send 256 bytes from the 68000 to the host
Command 05: Send N bytes to the 68000 from the host
Command 06: Send N bytes from the 68000 to the host
Command 0A: Send a three byte address to address register 7
Commands 0B thru 0F: Reserved but not implemented

As we can see from the above, Command 07 is a composite of the primitives 00, 01 and 05. Command 08 is a composite of the primitives 00, 01 and 06. Although we have only run this code on the Pet/CBM as of this writing, Applesoft should work in the same way. ALL OF THESE COMMANDS ARE CALLABLE FROM MACHINE LANGUAGE.

How about commands $10 thru $FF? Well, those commands assume that an appropriate jump table has been loaded into the 68000 RAM beginning at $0010F4. The jump table contains TWO byte addresses, so that all commands must jump to code located within the 64K byte zero page. Commands $10 thru $19 are already reserved for the floating point package.

Command 02 is used to expand the effective number of 'commands' and also to jump to code either inside or outside the zero page.

Page 4

THIS MONTH'S DAVID STOCKMAN AWARD FOR PRUDENCE AND FORETHOUGHT goes to the Apple employee who wrote "copy to Mike Markulla; copy to legal department" on the front of DTACK GROUNDED newsletter #4 and then made at least three illegal photocopies of our copyrighted newsletter, sending one OUTSIDE Apple Computer. The recipient of the copy sent outside Apple naturally phoned us, which is how we heard about it. We are now trying to persuade the owner of that outside copy to part with it so we can frame it and hang it on our wall.

THIS MONTH'S 'TACKY' AWARD goes to ANY legal department ANYWHERE which would knowingly work with illegal photocopies of any copyrighted publication. Those guys are SUPPOSED to have high ethical standards; they are officers of the court and all that. Tsk.

We have evidence that our readership is MUCH higher than would be indicated by our subscription list. We have reached the shocking conclusion that there are several persons out there who make ILLEGAL photocopies of our copyrighted newsletter!

We want those several individuals to know that we are going to get VERY upset and consider legal action if they make more than 300,000 photocopies! As we have mentioned before, our attitude toward private individuals making copies is very different from our attitude toward commercial enterprises making copies for business reasons.

DTACK GROUNDED has received its first anonymous letter. Although there is no return address, the postmark was San Jose, Nov. 4.

Those who are familiar with copyright law will be aware that the writer of a personal letter retains legal copyright to the content of that letter even if there is no notice to that effect. Congress passed that part of the copyright law to protect the privacy of the ordinary citizen's correspondence. But since the letter in question is unsigned, WHO owns the copyright? We have decided that WE do, so the letter is reproduced on the next page. After all, we did promise to answer correspondence, and this seems to be the only way we can do that in this particular case.

The upper half of page one clearly reveals that the writer has not read newsletters 1, 2 and 3. The lower half contains opinions to which the writer is entitled. We DO want to mention that the software portability we were referring to mainly involved the high level languages THEMSELVES, plus machine language application programs. The point made by the writer regarding applications programs written in FULLY STANDARDIZED high level languages is correct.

The point raised by the writer on the top of the second page is absolutely, totally, right on target. However, it should be apparent that the new generation of friendly software will be much more complex and require a larger memory space. To execute this large, complex program and retain adequate response time we need a very fast CPU with a large memory space. The reason we had not made this point in previous newsletters is that the obvious response to us would be, "So write that large, complex, friendly program!". As we have admitted previously, we are mostly hardware types here. Writing complex, high quality programs is beyond our ability.

Page 5

The bottom of page two contains four reasons why DTACK GROUNDED will not succeed. We have chosen to censor reasons 3 and 4. Reason three involved a wild guess about future product plans of the Tandy corporation (possibly influenced by an item on page 34 of the Nov. 3 '81 issue of ELECTRONICS magazine). We retain the right to make the wild guesses around here! Reason four describes a future product of the Apple Computer Company and we have censored THAT because it might NOT be a wild guess, considering the postmark. And it might upset the folks at Apple Computer if we printed it.

This letter was discussed during the same phone call mentioned earlier. When we mentioned that the writer had no basis for reason number 1 since newsletter #4 did not contain price information, our caller (who had also only read #4) replied, "Oh, yes it did! The price was stated to be $1000". Oops! The $1000 was a general average of the price of the board, which varies over a 3-1 range depending on the amount of memory. If four large companies introduce Apple II compatible 68000 boards for a lot less than ours, there are going to be some AWFULLY cheap 68000 boards available!

Finally, our board DOES violate Apple II peripheral power specs if loaded with more than the minimum memory, as mentioned clearly in earlier issues of this newsletter. But since when has exceeding the power specs been a basis for the business failure of an Apple II peripheral?

Allowing for the fact that the writer had not read the first three newsletters and so was missing some critical information, we can't find any disagreement.

As you can see, we DO answer our correspondence. However, we really prefer that you SIGN any letters you send us, and include a return address.

Page 6

Dear DTACK Grounded,

I read issue #4 of your newsletter with interest. I was disappointed that you did not describe your 68000 board (for the Apple II) in greater detail. How fast is it clocked? Can it access Apple II RAM (like the graphics screen) & with what bandwidth? Can the 6502 access the local 68000 RAM? Will you supply a 6502 based I/O subsystem (how will you read the disk, for example). Is DTACK really grounded?

I thought your newsletter spent far too much time rationalizing your reasons for existing. It got a little embarrassing after a while, especially since a lot of your reasoning is really off the wall. For example, your arguments for transporting application software are stupid -- most applications are written in high level languages and are thus processor independent -- it's probably easier to port an application written in Pascal between a Z8000 and a 68000 system with similar terminals, disks, & RAM capacities than between a 64K 68000 system and a 256K one.

Page 7

I also think you're missing the point of the new generation of microprocessors. You are not suggesting anything new to do with the 68000 -- you seem to think the new generation of processors will run the same kind of software as the old, only faster. Most software on the market is incredibly hard for a non-computer oriented person to use. The real revolution (& accompanying sales) will come when the power of the 68000 is used to make systems that are not an order of magnitude faster but rather are an order of magnitude easier to use. (Have you seen the Xerox Star? )

Finally, I think your business will not succeed for the following reasons:

(1) There will probably be at least 4 68000 boards available for the Apple II within a year, some by major companies and selling for a lot less than yours.
(2) Your board sounds like it violates Apple II peripheral power specs (maybe by a factor of 3?)
(3)
(4)

Page 8

USEFUL APPLICATIONS OF THE MOTOROLA 68000: The Oct. '81 issue of COMPUTE contains an excellent article explaining the scaling problems and roundoff errors that can occur while inverting a matrix. The author then provided one method of minimizing these errors while programming in BASIC. This article is highly recommended and indeed forms the basis for the subject matter to follow.

Using the methods outlined in that article, and using a conventional 8 bit processor running BASIC, 40 to 50 seconds were required to invert a 10 X 10 matrix. We will outline a method which also minimizes roundoff errors but will invert a 10 X 10 matrix in about 0.23 seconds. That is roughly 200 times faster. We are told that the APL package provided with Commodore's new Superpet (who is the genius who came up with that name?) will invert the same size matrix in ten seconds, but we do not know whether the algorithm used minimizes scaling problems and roundoff errors. Anyhow, the following method will be 48X faster.

We will adopt a higher precision 'temporary real' floating point format for the duration of the matrix inversion to fight the scaling and roundoff problems. Instead of a 32 bit mantissa we will use 48 bits. Instead of an 8 bit exponent we will use 14 bits. At the conclusion of the inversion, the numbers will be converted back to the usual Pet/CBM and Apple II floating point format.
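The precision gained from the wider mantissa can be estimated from the usual rule that each bit is worth log10(2), or about 0.301, decimal digits. The short Python sketch below applies that rule; the function name `decimal_digits` is hypothetical, and the exponent comparison assumes a plain binary exponent with no reserved codes, which the text does not specify:

```python
import math


def decimal_digits(mantissa_bits):
    """Approximate number of decimal digits carried by a binary mantissa."""
    return mantissa_bits * math.log10(2)


for bits in (32, 48):
    print(f"{bits} bit mantissa: about {decimal_digits(bits):.1f} decimal digits")

# Number of distinct exponent codes for each width (assumption: no reserved values).
print(2 ** 8, "exponent codes versus", 2 ** 14)
```

Rounded down, this gives 9 digits for the 32 bit mantissa and 14 digits for the 48 bit one.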

The algorithm used in the COMPUTE article used twice the minimum required memory since a duplicate matrix was used. There is an "in-place" version of the Gauss-Jordan algorithm where the rows and then the columns of the matrix elements are rotated after each pass. We will use an improvement on this technique by maintaining pointers to the rows and the columns, and rotating the pointers while leaving the elements of the matrix in their original location. This greatly reduces data movement.

We will further reduce data movement by writing the floating point multiply routine so that FPACC#1 remains unchanged (and in fact located in the register space of the 68000 to minimize access time). FPACC#2 will simply be the particular matrix element to be multiplied, as located via the row and column pointers. Therefore, there are no temporary F.P. registers at all and no waste data movement.

The following code calculates the memory location of matrix element Aij and places that address in address register A2:

MOVE .L (A0), A2      REM A0 IS POINTER TO ROW POINTER
ADDA .L (A1)+, A2     REM A1 IS POINTER TO COLUMN POINTER

Note that the pointer to the row pointer is NOT incremented after loading (the next matrix element will be from the same row unless this is the last element of the row). The pointer to the column pointer IS incremented to be ready for the next element of the row.
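The pointer scheme can be pictured with a small Python model. This is a sketch under stated assumptions, not the routine itself: the class name `PointerMatrix` is hypothetical, offsets are counted in elements rather than bytes, and the rotation direction is a guess. What it does show is the one idea taken from the text: an element is located by adding a row pointer to a column pointer, and "rotating" the matrix only permutes the pointers.

```python
class PointerMatrix:
    """Square matrix stored flat, addressed through row and column pointer tables."""

    def __init__(self, rows):
        n = len(rows)
        self.data = [x for row in rows for x in row]   # elements never move
        self.row_ptr = [i * n for i in range(n)]       # offset of each row's start
        self.col_ptr = list(range(n))                  # offset of each column in a row

    def address(self, i, j):
        # Same sum as MOVE.L (A0),A2 followed by ADDA.L (A1)+,A2.
        return self.row_ptr[i] + self.col_ptr[j]

    def get(self, i, j):
        return self.data[self.address(i, j)]

    def rotate_rows(self):
        # Logical row 0 becomes the old row 1, and so on; only pointers change.
        self.row_ptr = self.row_ptr[1:] + self.row_ptr[:1]

    def rotate_cols(self):
        self.col_ptr = self.col_ptr[1:] + self.col_ptr[:1]


m = PointerMatrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
m.rotate_rows()
m.rotate_cols()
print(m.get(0, 0), m.data)
```

Rotating an n by n matrix this way touches 2n pointers instead of n squared elements, which is where the saving in data movement comes from.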

The total time required to multiply one element (element not zero) is 107 usec or 4.25 usec for a zero element. Allowing 85 usec to perform a floating point addition, the actual computation time for a 10 X 10 matrix inversion with NO zero elements will be about 0.21 seconds. A small additional time will be required to transform the 9 decimal digit precision numbers into 14 decimal digit 'temporary real' format and vice versa. A total of 0.23 seconds looks about right.

Of that 107 usec, 2.25 usec was required to calculate the location of the element and another 8.25 usec to fetch and store the data. Therefore, the actual matrix element multiplication requires 96.5 usec.
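The timing budget can be checked with a little arithmetic, shown here in Python. The breakdown of the 107 usec comes straight from the figures above. The whole-inversion estimate assumes roughly n cubed multiply-and-add steps for an n by n Gauss-Jordan inversion, which is a common rule of thumb and not a count given in the text; it lands a little under the 0.21 second figure, the difference presumably being divisions and loop overhead:

```python
MULTIPLY_US = 107.0      # full multiply of one non-zero element
LOCATE_US = 2.25         # computing the element address
FETCH_STORE_US = 8.25    # loading the operand and storing the result
ADD_US = 85.0            # allowance for one floating point addition

core_multiply_us = MULTIPLY_US - LOCATE_US - FETCH_STORE_US


def inversion_estimate_s(n):
    """Rough time for n**3 multiply-add steps (assumed operation count)."""
    return n ** 3 * (MULTIPLY_US + ADD_US) / 1_000_000


print(f"core multiply: {core_multiply_us} usec")
print(f"10 x 10 estimate: {inversion_estimate_s(10):.3f} s")
```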

SOURCE CODE: We are still developing this algorithm; the floating point add routine has not yet been written, for example. However, the source code of the completed routine will be published (perhaps in REDLANDS) when the coding has been completed.

LARGER MATRICES: If we are looking at a larger matrix, say 40 X 40, an inversion with our 68000 board will take LESS than 15 SECONDS, the 6809-based APL in the Superpet will take MORE than 10 MINUTES (Bob, are you SURE you want to back the 6809?) and the code listed in the COMPUTE article would require nearly an hour, all to do the same job (remember, we aren't sure about the scaling and roundoff characteristics of the 6809 APL package).
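These figures follow from the 10 X 10 times if inversion time grows as the cube of the matrix dimension, so that a 40 X 40 matrix costs 64 times as much. That cubic scaling is an assumption of this sketch (it ignores fixed overhead), and the 45 second figure is simply the midpoint of the 40 to 50 second range quoted for BASIC. In Python:

```python
def scale_time(t_small, n_small, n_large):
    """Scale an inversion time from n_small to n_large assuming cubic growth."""
    return t_small * (n_large / n_small) ** 3


for label, t10 in (("68000 board", 0.23), ("Superpet APL", 10.0), ("BASIC", 45.0)):
    t40 = scale_time(t10, 10, 40)
    print(f"{label}: {t40:.1f} s ({t40 / 60:.1f} min)")
```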

Inverting large matrices is something that has been restricted to timesharing with large computers in the past. Those of you who are unfamiliar with matrices probably think this is an esoteric application seldom encountered in real life. WRONG! Matrix inversion is used to solve large systems of linear equations, analyze the behaviour of complex electronic circuitry (our particular area of interest) plus many other very real applications. The ONLY reason you haven't run into this much in the personal computer field is that people won't sit around for hours or days waiting for answers (time is money, yes?). So only the time-sharing freaks have been using matrix inversion up to now.

Page 9

If you are wondering how we got from 10 minutes for a SINGLE inversion to 'hours and days', it is for the same reason that the time to perform a SINGLE addition doesn't represent the response time of an electronic spreadsheet program. To determine the amplitude and phase of an electronic circuit versus frequency, fifty or so inversions are needed across the frequency spectrum. Then, when the response is evaluated and a component or two is changed, the whole thing is repeated. This IS a practical application for a 68000. It is IN NO WAY practical for a 6502 or 6809.
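The jump from minutes to hours is plain multiplication, sketched here in Python. The function name `sweep_hours` is hypothetical, and the 0.25 minute figure for the 68000 is the "less than 15 seconds" bound for a 40 X 40 inversion taken at face value:

```python
def sweep_hours(inversions, minutes_each):
    """Hours needed for one frequency sweep of `inversions` matrix inversions."""
    return inversions * minutes_each / 60


print(f"10 minute inversions:  {sweep_hours(50, 10):.1f} hours per sweep")
print(f"15 second inversions:  {sweep_hours(50, 0.25) * 60:.1f} minutes per sweep")
```

A handful of component changes, each needing a fresh sweep, is enough to turn the first figure into days.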

MORE EMBARRASSING, OFF THE WALL, STUPID RATIONALIZATIONS: Every now and then, we print something in this newsletter that turns out to be true. In the last newsletter, we stated that many of the programs that would be available for the new IBM personal computer would be 8080 programs running at 8080 speeds. Take, for example, the best known of the several electronic spreadsheet programs. This program is written in machine code and, unlike most such programs, uses its own floating point routines rather than call the routines in the host's BASIC ROM. This program is available for the IBM machine. The question is: Is the machine code in question 8080 code or 8088 code?

If we could see a speed comparison between the IBM machine and one of the standard 8080/Z80/8085 personal computers, we would have the answer because 8088 machine code would be MUCH faster. As it happens, just such a comparison has been published in the DEC '81 issue of CREATIVE COMPUTING (see pages 37 and 38). A specific application was tested on both the IBM machine and on a TRS 80 II. The TRS 80 won easily!

So, like we said before, if you like running 8080 code slowly, be SURE to buy the IBM personal computer (we would like to thank Dr. J.S.K. for calling our attention to that comparison).

In fact, you will shortly be able to run 8080 code slowly ON YOUR APPLE since (rumor has it) an 8088 processor board is forthcoming. Does anyone know where we can get a 137 slot Apple II? Is anyone working on a TMS1000 processor card for the Apple?

SO YOU WANT TO BE A NEWSLETTER EDITOR? How does one write (some) 68000 software, design not one but TWO 68000 attached processors, keep another business going to pay the bills and still find time to write a newsletter? Hint: this page is being written on Thanksgiving Day.

MORE Z8000 STUFF: At that, this newsletter writer is better off than the 139 employees of Advanced Micro Computers who were just fired. AMC (a subsidiary of AMD) built board and system level computers using the AMD version of the Z8000. AMD is the outfit we wrote about in the last issue which dropped the Z8000 and took on the task of second sourcing the 8086. Because AMD is no longer particularly interested in the Z8000 and because AMC was a money-losing operation, the subsidiary was closed and the employees were fired.

Page 10

ALL HEART DEPT: Jerry Sanders, the president of AMD, had pledged at the start of the current recession that no employees would be laid off. Jerry (according to press reports) took particular care to point out that the 139 ex-AMC employees were NOT laid off. They were FIRED. We can be sure that these ex-employees and their families will be comforted by the distinction on this day of Thanksgiving.

EVEN MORE Z8000 STUFF (or maybe even more EOTWSR): We pointed out last issue that Zilog had lost its domestic second source for the Z8000 and that the company is (and always has been) highly unprofitable. Since then, Exxon has begun to buy out the minority Zilog stockholders so that it will own 100% of Zilog instead of the 90% it held until recently. When this has been completed, Exxon has announced that Zilog will become a subsidiary of another Exxon company.

Unless an economic miracle occurs, Zilog will then be a highly unprofitable subsidiary. Now, let's see, what often happens to money-losing operations which are subsidiaries? Do you suppose Jerry Sanders could cast a light on that subject? Do YOU want to start developing Z8000-based products now? How far off the mark was the last issue of DTACK GROUNDED which stated that the Z8000 had just crashed and burned?

ANOTHER USEFUL APPLICATION OF THE MOTOROLA 68000 (WITH EXPANSION BOARD): Professional CAD graphics application packages have in the past generally run about 15 to 30 thousand dollars for the software and 300 to 500 thousand (!) for the hardware. One vendor of a very sophisticated package has adapted his existing package (which runs on BIG, EXPENSIVE equipment) for both the Apple II and III. The adaptation has been completed and beta site evaluation has begun.

The package seems to run well with the reservation that some "frames" of data run up to 64K bytes. This means the floppy disk drive spends, in some cases, a lot of time looking for a particular datum. In addition, the search time for certain complex tables takes a while longer than is comfortable. One can always wait a while if one saves several hundred thousand dollars (!!), but it would be nice to save the money and not have to wait.

Now, let's see: with a full up 68000 board and a 128K expansion board we have a 220K byte 68000 system that will work with both the Apple II and III. That's enough memory to hold the entire frame of data plus an assembly language program (written using an Apple Pascal cross-assembler) to perform the data search and manipulation. Would this be a useful optional accessory for the CAD graphics house's customers? They must think so, because they have placed an initial order for TWO fully loaded Apple compatible 68000 boards. And they are asking pointed questions about delivery of expansion boards.

WAIT JUST A COTTON PICKING MINUTE!! DOES THAT MEAN YOU ARE GOING TO SELL YOUR BOARDS TO WORK WITH THE APPLE III?

Of course. Our Apple compatible boards work with the Apple III just as well as the II.

BUT WE THOUGHT YOU WERE CARRYING ON A JIHAD AGAINST THE APPLE III!

Perish the thought. Our targeted market is those people who ALREADY OWN suitable host equipment. For someone who has already purchased an Apple III, the cost of our board is EXACTLY THE SAME as for someone who has purchased an Apple II.

Our reason for poking a little fun at Apple in newsletters #1 and #4 was to let our readers know that the executives at Apple Computer are mortal and make mistakes, even as you and we. And since we and Apple are both going to be trying to sell 68000 systems to YOU we want to at least start even.

Page 11

EXPANSION BOARD INFO: It turns out that 128K RAM plus the address and data buffers will occupy a board ALMOST as large as our 68000 board. If we make the two boards the same size we are going to have some space left over. Perhaps we could place a nice pink marble statuette of a naked lady with a clock for a navel in the open space? Hmm. No, that won't work. Let's see. CLOCK!! THAT'S IT!! We will put the same real time clock that we use in our noise monitor (the one that works, remember?) in the open space.

That way Pet/CBM users, Apple II users and even Apple III users can have a REAL TIME CLOCK THAT WORKS. And since many users will want to buy the expansion board just for the real time clock, we will sell it with 64 unpopulated RAM locations for just $200 (it's a BIG board, with LOTS of holes). Just promise (cross your heart) not to cheat us by populating any of those empty RAM locations.

Q & A #1: We have been asked by several people how much we paid Jeff Mazur to write the article on page 174 in Nov. SOFTALK. We gave him 3 free newsletters!

Q & A #2: To answer a question asked by several owners of Commodore computers, we make both Commodore compatible 68000 boards and FULLY Commodore compatible boards. The FULLY compatible boards are the ones which have big holes drilled in the unpopulated RAM locations. Commodore owners should be careful to specify 'Commodore compatible' or 'FULLY Commodore compatible' when ordering 68000 boards.

THE 68000 IS A REALLY NEAT DEVICE #1: We did not mention this in the earlier writeup on inverting a matrix, but that double precision FP multiply is an outstanding example of the large register space of the 68000. The ONLY memory references were two to locate Aij, four to load ONE of the operands and four more to store the result. BOTH operands and ALL intermediate results were completely contained within the 68000 register space.

Any other, lesser, microprocessor (and all the others ARE lesser) would be CONSTANTLY CHURNING BACK AND FORTH TO MEMORY with the intermediate results. This churning back and forth to memory contributes NOTHING to the computation but it DOES eat up a lot of time. It is the 64 bytes of internal register space which makes the very fast double precision multiply possible (32 bit registers don't hurt the speed, either).

THE 68000 IS A REALLY NEAT DEVICE #2: We started out here at (parent) Digital Acoustics building smart instruments with the Intel 4040, which at that time was the world's best microprocessor. So we know how to write assembly code for the 4040, and we HAVE written plenty of working commercial code for that device.

Then the 6502 came along and became the new "world's best microprocessor". We have written LOTS of assembly language code for that device. Since we have programmed both the 4040 and the 6502, we can tell you in all sincerity that there is absolutely no comparison whatever in the ease of programming a particular application between the two. The 6502 wins hands down.

Now the 68000 is here and it is the EVEN NEWER "world's best microprocessor". We have written enough assembly language code ALREADY for the 68000 that we can tell you in all sincerity that there is absolutely no comparison whatever in the ease of programming a particular application between it and the 6502. The 68000 wins hands down.

We had thought that it would be obvious that microprocessors with greater resources and more sophisticated instruction sets are automatically easier to program. However, many users of eight bit machines who have never programmed anything else apparently feel that a device like the 68000 is MUCH more complex (true) and MUCH harder to program (false).

Page 12

BEGINNING AT THE BEGINNING: It is ALWAYS difficult to start programming ANY new machine in assembly language, mostly because of the need to memorize the assembly language syntax. That's why we have written a menu-driven program which, given a particular mnemonic such as "ASL" will walk you through all the options for that instruction.

For ASL, the first option is, are we shifting a data register or memory? If register, the second choice is the operand size: byte, word or long word? We choose long word and then we are given the choice of shifting a particular number of bit positions from 1 to 8 or of shifting by the number contained in another data register. We choose 1 thru 8. What number? We choose 1. What data register? We choose register 3.

This completes the selection process for that particular instruction. What we are left with at the top of the screen is "ASL .L #1, D3" which is the Motorola compatible assembly syntax, plus "E3 83" which is the hexadecimal code for the instruction. Incidentally, the Motorola compatible assembly syntax builds as the selections are made. Therefore, it is a useful program to TEACH Motorola 68000 assembly language syntax.
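The bit fields chosen in that walk-through map directly onto the instruction word. As a hedged illustration (this is not the menu-driven program, and the function name `asl_imm` is hypothetical), the Python sketch below packs the same choices for the register form of ASL with an immediate shift count:

```python
SIZE_BITS = {"B": 0, "W": 1, "L": 2}   # byte, word, long word


def asl_imm(count, size, dn):
    """Encode ASL.size #count,Dn for a shift count of 1 through 8."""
    if not 1 <= count <= 8:
        raise ValueError("immediate shift count must be 1 through 8")
    if not 0 <= dn <= 7:
        raise ValueError("data register must be D0 through D7")
    # A count of 8 is stored as 0 in the three bit count field.
    return 0xE100 | ((count & 7) << 9) | (SIZE_BITS[size] << 6) | dn


word = asl_imm(1, "L", 3)
print(f"ASL .L #1, D3 -> {word >> 8:02X} {word & 0xFF:02X}")
```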

GOUGEM & CHEATEM DEPT: How much will we sell this program for? We have placed it COMPLETELY in the public domain! A group in Canada, headed by Les Titze, has translated this program into Apple II format, improved it and added an option to output each line to a printer. Les has generously left this improved Apple II version of the program in the public domain.

THE RED CROSS WE AIN'T: We provide this program, on unprotected disk, to each DTACK GROUNDED customer. We encourage them to share the program with their friends. WE ARE NOT, REPEAT NOT, VOLUNTEERING TO SERVE AS A DUPLICATION AND DISTRIBUTION CENTER!! Requests for copies of the program will be ignored (would you like to buy a board?).

ABOUT THAT TRASH 80 IV: According to Electronics magazine (a McGraw-Hill publication), the long-awaited 16 bit computer from Tandy will have a 68000 CPU instead of the 8086 which most of us expected. Could it be that Motorola is willing to unload EVEN MORE of those 4MHz parts at bargain prices? Is it possible the machine will use a software refresh scheme for its dynamic RAM?

After the kid next door brings his TRASH 80 IV home, is he going to JUST DIE LAUGHING at your Apple II or Pet after comparing benchmarks, even at 4MHz?

STRINGS AND THINGS: You may have noticed that we have not, up to now, mentioned much about string handling with the 68000. One reason is that Microsoft BASIC strings work VERY, VERY differently than the Wang BASIC with which we are familiar. Although we took an instant dislike to the Microsoft string handling, we held off on the assumption that it was just the unfamiliarity which caused our dislike, and that we would warm up to the Microsoft strings with experience. WRONG! We have now decided that the string handling in the Pet/CBM (and we assume in the Apple also) is something that seriously needs fixing.

HARDWARE STATUS REPORT: You have maybe noticed the front cover of this issue? The production Apple II compatible boards are working fine and, believe it or not, without any cuts or jumpers. EVERYBODY WHO DOUBTED THAT WE COULD REALLY PRODUCE WORKING 68000 BOARDS FOR THE PET AND APPLE GO STAND IN THE CORNER!!

Layout of the 128K expansion board is (barely) under way and we are opening discussions with an OSI user about putting the 68000 board onto OSI systems. We have tentatively made an offer to work with TRS 80 II users to place the 68000 on that system (we are absolutely serious about eventually attaching our simple 68000 board to ALL of the suitable host processors).

Page 13

We have identified four versions of the Pet/CBM so far. Version 1 has a printed circuit edge connector for the memory expansion port, and we can't tie our 68000 board to that version at this time because of connector incompatibility.

Version 2 has two vertical connectors in a straight line. We can work with this unit with either BASIC 3.0 or 4.0, and NO cuts or jumpers are required since $8900-8FFF is decoded to the memory expansion connector. That "$8900-8FFF" is NOT a misprint.

Version 3 is the 8032 with assembly number 8032080 (the assembly number is located just behind the beeper). This is what we have, and one cut plus a 'dead bug' is needed to decode $8800-8FFF to the expansion connector. Also, the two expansion connectors are staggered.

Version 4 is the 8032 OR "fat 40" with assembly number 8032080, again behind the beeper. The memory expansion port already is decoded for $8800-$8FFF (good) but the BASIC ROMS are soldered in (terrible). If you have a version 4, don't buy a 68000 board from us yet, unless you are up to tackling the problem of unsoldering and socketing those ROMS. Eventually we will have a RAM card which overlays these ROMS, but that is several months off.

SOFTWARE STATUS REPORT: We have identified the 'hooks' to the BASIC floating point routines for the Pet 4016/32 BASIC 3.0 and are currently in the process of doing this for the Apple II. Also, we need to convert our 6502 utilities (such as the host side of the monitor) to work with the Apple.

Surprisingly, the apparently simple task of moving 4K bytes of binary code from our Wang 2200 to the Apple is proving difficult. We use 2532 EPROMS in our environmental noise monitors, and these happen by coincidence to be compatible with the 4K "USER ROM" sockets in the Pet/CBM machines. Apparat makes a 2532 EPROM board for the Apple II, but it is in short supply.

We have just tried, for the first time, using our 68000 board in conjunction with a BASIC compiler. As we had anticipated (hoped?), no modification whatever was necessary since the compiler called the floating point routines in the BASIC ROM, and we had already 'hooked' these routines into the 68000.

The compiler in question was the DTL compiler for the CBM 8032/8050. We borrowed this compiler for one afternoon. We have one of these on order for our permanent use, and will provide an extensive report on this in our next issue. Our brief test indicates that the promised (by us) improvements are REAL. You aren't SURPRISED, we hope?

ACKNOWLEDGEMENTS: Apple; singular, II and III are trademarks of the Apple Computer Co. Pet and CBM are trademarks of Commodore Business Machines. TRS 80 II is a trademark of the Tandy Corp. and DTACK GROUNDED is, you had better believe it, OUR trademark. Nobody has claimed TRASH 80 IV yet, but stick around.

SUBSCRIPTIONS: $15/6 issues U.S. and Canada, $25 U.K. or Germany. Payment should be made to DTACK GROUNDED. The subscription will start with the first issue unless otherwise specified. The address is:

DTACK GROUNDED
1415 E. McFADDEN, St. F
SANTA ANA CA 92705

If you received this newsletter as a free sample or via photocopy discount, this is the end. For subscribers, the next four RED pages continue the discussion of our floating point package covering the normalization and adjustment of the exponents for the 68000 multiply and divide routines.

Page 14

Issue #1 covered the integer portion of the floating point multiplication routine right up to the point at which normalization might be required. This issue continues the discussion which ended with paragraph 2, page 11.

The exponent is traditionally represented in floating point packages as an unsigned number. This is accomplished by adding a fixed constant to the exponent. In Microsoft Pet/Apple floating point, the exponent is an 8 bit number which represents powers of 2 from -127 to +127. The astute observer will note that only 255 of the possible 256 values of an 8 bit number are required for that range. The reason is that the value zero is reserved to flag the floating point number zero. If the exponent is zero, it is not necessary to examine the mantissa to determine the value of the number (indeed, the mantissa is then irrelevant).

The fixed value which is added to the exponent is #128 or $80. Since the mantissa ranges between exactly 1/2 and not quite 1, the following are sample representations of particular numbers:

   NO.        EXPANDED FORMAT        COMPRESSED FORMAT
   ---        ---------------        -----------------
          sign exp    mant  guard       exp mantissa
  0.25      0  $7F  $800000  $00        $7F $000000
  0.50      0  $80  $800000  $00        $80 $000000
  1         0  $81  $800000  $00        $81 $000000
 -1         1  $81  $800000  $00        $81 $800000
 10         0  $84  $A00000  $00        $84 $200000
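
As a cross-check on the expanded-format column, the following is a minimal sketch that turns a sign, exponent and mantissa back into an ordinary number. It assumes the 24 bit mantissas printed in the table and ignores the guard byte; the function name expanded_value is hypothetical.

```python
def expanded_value(sign, exp, mant):
    """Value of an expanded-format number (24 bit mantissa, as in the table)."""
    if exp == 0:
        return 0.0                       # a zero exponent flags the number zero
    fraction = mant / 2**24              # mantissa lies in [1/2, 1)
    magnitude = fraction * 2.0 ** (exp - 0x80)   # remove the $80 bias
    return -magnitude if sign else magnitude


for row in [(0, 0x7F, 0x800000), (0, 0x80, 0x800000),
            (0, 0x81, 0x800000), (1, 0x81, 0x800000), (0, 0x84, 0xA00000)]:
    print(row, expanded_value(*row))
```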

In a properly normalized floating point number, the most significant bit of the mantissa is always a 1. Microsoft takes advantage of this fact to get a 'free' storage space for the sign in the compressed format. In the expanded format (which is what we find in the floating point accumulator) the sign bit occupies an entire byte of its own. The sign bit resides in the most significant bit of that byte, and the remaining seven bits are irrelevant. In fact, some very strange things happen to those seven bits in the Microsoft 6502 F.P. package.
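
The 'free' sign bit trick can be sketched as follows. This is an illustration only, assuming a normalized number and the 24 bit mantissas used in the table above; compress and expand are names we made up for the purpose.

```python
def compress(sign, exp, mant):
    """Expanded -> compressed: the always-1 mantissa MSB is replaced by the sign."""
    packed = (mant & 0x7FFFFF) | (0x800000 if sign else 0)
    return exp, packed


def expand(exp, packed):
    """Compressed -> expanded: recover the sign, then restore the implied 1."""
    sign = 1 if packed & 0x800000 else 0
    mant = packed | 0x800000
    return sign, exp, mant


print([f"${v:X}" for v in compress(0, 0x84, 0xA00000)])   # the table's row for 10
```

Note that this sketch does not special-case an exponent of zero, where the mantissa is irrelevant anyway.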

We will concern ourselves here with the EXPANDED format exclusively.

As can be seen above, the number "one" is actually represented as 1/2 (the mantissa) times two raised to the first power (the exponent). When multiplying, we add the exponents. Since this gives us an exponent with "excess 256" rather than "excess 128", we then subtract #128 or $80 from this result. If the result of the integer multiplication is unnormalized, we must shift the result of the multiplication (the mantissa of the result) one bit left and subtract one from the exponent to compensate.

We can agree that one times one equals one. Let us perform this multiplication using the procedure of the last paragraph:

After adding $81 to $81 and then subtracting $80, we get $82. After integer multiplying $800000 times $800000, the result is $400000000000. Since the most significant bit of the result is a zero, we shift the most significant 4 bytes 1 bit left, resulting in $80000000 (the last byte is the guard byte). Now we have to subtract one from the exponent to compensate for this shift. $82 - $01 = $81. Therefore, we wind up with the standard representation of "one" as the result. Surely, this is the desired outcome!
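
The arithmetic of that example can be replayed with a short sketch. It assumes 24 bit mantissas as in the example, nonzero operands, and no overflow or underflow; it shifts the whole 48 bit product before splitting off the guard byte, which is a simplification of ours rather than a description of the 68000 routine. The name fp_multiply is hypothetical.

```python
def fp_multiply(e1, m1, e2, m2):
    """Multiply two nonzero expanded numbers; returns (exp, mant, guard).

    Sign, overflow and underflow are deliberately ignored in this sketch.
    """
    exp = e1 + e2 - 0x80                 # sum is "excess 256"; remove one bias
    product = m1 * m2                    # 48 bit integer product
    if not product & (1 << 47):          # MSB clear: result is unnormalized
        product <<= 1                    # shift one bit left ...
        exp -= 1                         # ... and compensate in the exponent
    mant = product >> 24                 # top three bytes
    guard = (product >> 16) & 0xFF       # next byte down is the guard byte
    return exp, mant, guard


exp, mant, guard = fp_multiply(0x81, 0x800000, 0x81, 0x800000)   # one times one
print(f"${exp:02X} ${mant:06X} ${guard:02X}")
```

Because each mantissa is at least 1/2, the product is at least 1/4, so a single left shift is always enough to normalize.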

Page 15

The preceding example ignored the guard byte portion of the integer multiplication and also the sign. This was done on purpose to concentrate on the calculation of the exponent portion of the result. The accompanying 68000 code performs these operations, plus the calculation of the sign.

The sign of a positive number times a positive number is positive; the sign of a negative number times a negative number is also positive. Only when the two numbers have different signs will the result be negative. You Boolean experts will recognize this as the "exclusive or" operation. Therefore, calculation of the sign of the result involves "EOR"ing the two signs with the result placed in the sign of FPACC#1.
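
In sketch form, assuming each sign lives in the most significant bit of its own byte as described for the expanded format (the name result_sign is ours):

```python
def result_sign(s1, s2):
    """Sign byte of a product or quotient: exclusive-or of the operand signs.

    Only bit 7 carries meaning; the other seven bits are masked off here.
    """
    return (s1 ^ s2) & 0x80


print(result_sign(0x80, 0x00) != 0)      # unlike signs give a negative result
```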

When dividing, the sign is calculated in exactly the same way.

When the Microsoft Pet/Apple floating point package performs a divide, FPACC#1 is first rounded and then divided into FPACC#2. Rounding is accomplished by adding the most significant bit of the guard byte to the mantissa. The guard byte is then assumed to be zero (it is ignored).

The exponent calculation is done by SUBTRACTING the exponent of FPACC#1 (X1) from the exponent of FPACC#2 (X2). Since this cancels the "excess 128", we then have to add #128 to the result.

Since we are dividing by a number less than one (and as small as 1/2), the result of the integer division may be one or greater. In that case, the integer result (the mantissa) must be shifted 1 bit RIGHT to properly normalize the mantissa of the result. We then ADD 1 to the exponent to correct for this normalization. In the particular algorithm used in the 68000 F.P. package, address register 1 (A1) is used as a flag to indicate whether this normalization is necessary.
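
A minimal sketch of the divide bookkeeping follows. It assumes 24 bit mantissas, nonzero operands, an already rounded FPACC#1, and no overflow or underflow test. It uses a plain comparison where the 68000 package uses A1 as a flag, so treat it as an illustration of the arithmetic and not of the actual routine. The name fp_divide is hypothetical.

```python
def fp_divide(e2, m2, e1, m1):
    """Compute FPACC#2 / FPACC#1 for nonzero operands; returns (exp, mant)."""
    exp = e2 - e1 + 0x80                 # subtraction cancels the bias; restore it
    quotient = (m2 << 24) // m1          # scaled so that 1.0 equals 1 << 24
    if quotient >= 1 << 24:              # result is one or greater
        quotient >>= 1                   # normalize one bit RIGHT ...
        exp += 1                         # ... and ADD 1 to the exponent
    return exp, quotient


exp, mant = fp_divide(0x81, 0x800000, 0x84, 0xA00000)    # one divided by ten
print(f"${exp:02X} ${mant:06X}")
```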

If either of the two operands is a zero, the result of a multiply or divide is uniquely determined without reference to the mantissa of either number. Therefore, the multiply routine must test whether either number is zero; if either is, the result is zero. When dividing, if FPACC#1 is zero, an illegal attempt to divide by zero must be reported. This will cause any BASIC program which may be running to be stopped. If FPACC#2 is zero, the result is zero.

These tests for zero are trivial and are not included in the accompanying code.

Our algorithms must also beware of overflow and underflow. Overflow occurs when the result of a calculation is such a large number that the exponent is greater than +127. This is a fatal error similar to division by zero. The error must be reported and execution of the BASIC program stopped when this occurs.

Underflow occurs when the result of a calculation is such a small number that the exponent is less than -127. This means that the result approaches zero more closely than can be accurately represented by the floating point format. This is not considered an error and in fact we simply set the result to zero and continue with business as usual.
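
The overflow and underflow rules for a multiply can be sketched as below. The decision on the carry and on bit 7 of the 8 bit sum loosely mirrors the branch structure at FPMUL1 in the listing, but the Python itself, including the name multiply_exponent and the use of OverflowError, is our own illustration. The later one-bit normalization adjustment is not included.

```python
def multiply_exponent(x1, x2):
    """Biased exponent of a product, or 0 when the result must be zero.

    Raises OverflowError when the true exponent would exceed +127.
    """
    if x1 == 0 or x2 == 0:
        return 0                         # either operand zero: result is zero
    total = x1 + x2
    carry = total > 0xFF                 # carry out of the 8 bit addition
    low = total & 0xFF
    high_bit = bool(low & 0x80)          # bit 7 of the 8 bit sum
    if carry and high_bit:
        raise OverflowError("exponent overflow")   # fatal, like divide by zero
    if not carry and not high_bit:
        return 0                         # underflow: quietly return zero
    return (low + 0x80) & 0xFF           # same as subtracting $80, modulo 256


print(f"${multiply_exponent(0x81, 0x81):02X}")     # the one-times-one case
```

A returned value of zero covers three cases at once: a zero operand, an underflow, and a corrected exponent that happens to land exactly on zero.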

If FPACC#1 contains the very largest possible legal number and the guard byte is $80 or more, then the process of rounding FPACC#1 will result in an OVERFLOW of FPACC#1. Since this is the number we are DIVIDING BY, if this condition occurs we set the result to zero and proceed!
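
The rounding of the divisor, including that overflow corner, might be sketched like this. We assume a 32 bit mantissa here, as the long word operations in the listing suggest; the name round_divisor and the use of None to mean "set the result to zero" are hypothetical conveniences.

```python
def round_divisor(exp, mant, guard):
    """Round FPACC#1 before a divide; returns (exp, mant) or None.

    None signals that the exponent overflowed during rounding, in which
    case the caller sets the quotient to zero and carries on.
    """
    if guard & 0x80:                     # MSB of the guard byte set: round up
        mant += 1
        if mant > 0xFFFFFFFF:            # carry out of the mantissa
            mant = 0x80000000            # renormalize ...
            exp += 1                     # ... and bump the exponent
            if exp > 0xFF:               # exponent no longer fits in a byte
                return None
    return exp, mant                     # guard byte is now treated as zero


print(round_divisor(0xFF, 0xFFFFFFFF, 0x80))       # the largest-number case
```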

IN THE NEXT ISSUE OF REDLANDS: FLOATING POINT ADDITION, PART 1

Code Listing

                          1          OPT     P=68000,BRS,FRS
001000                    2          ORG     $001000
                          3
                          4 * CALCULATE THE EXPONENT OF THE RESULT
                          5
001000  1A38 1903         6 FPMUL1   MOVE.B  X1.W,D5        FETCH EXP1 TO D5
001004  67 EE             7          BEQ     RZER           ZERO IF EXP1= 0
001006  1C38 190B         8          MOVE.B  X2.W,D6        FETCH EXP2 TO D6
00100A  67 E8             9          BEQ     RZER           ZERO IF EXP2= 0
00100C  DC05             10          ADD.B   D5,D6          ADD EXPONENTS
00100E  64 04            11          BCC     UND80          SKIP IF CY= 0
001010  6B DE            12          BMI     OVFL           OVFL IF CY, D7= 1
001012  6A 02            13          BPL     OV80           EXP $80 TO $FF
001014  6A DE            14 UND80    BPL     RZER           EXP UNDERFLOW
001016  DC3C 0080        15 OV80     ADD.B   #128,D6        CORRECT EXP
00101A  67 D8            16          BEQ     RZER           BR IF EXP IS ZERO
                         17
                         18 * CALCULATE AND STORE THE SIGN OF THE RESULT
                         19
00101C  1A38 190A        20          MOVE.B  S2.W,D5
001020  BB38 1902        21          EOR.B   D5,S1.W
001024  6B 08            22          BMI     MULX           SKIP IF D31 = 1
                         23
                         24 * HERE IS WHERE THE INTEGER PORTION OF THE
                         25 * FLOATING POINT MULTIPLY IS PERFORMED
                         26
                         27 * NORMALIZE BY SHIFTING LEFT ONE BIT, X1= X1-1
                         28
001026  E348             29          LSL.W   #1,D0
001028  E395             30          ROXL.L  #1,D5
00102A  5306             31          SUBQ.B  #1,D6
00102C  67 2C            32          BEQ     DRZER
00102E  11C6 1903        33 MULX     MOVE.B  D6,X1.W
001032  21C5 1904        34          MOVE.L  D5,M1.W
001036  31C0 1908        35          MOVE.W  D0,G1.W
00103A  4E75             36          RTS
                         37
                         38 * START FLOATING POINT DIVIDE
                         39 * ROUND X1, M1, G1  (X1, M1 ARE IN DATA REGS)
                         40
00103C  E1F8 1908        41          ASL     G1.W           MSB OF GUARD BYTE TO CY
001040  64 0A            42          BCC     FPDIV2         SKIP IF NO CY
001042  5280             43          ADDQ.L  #1,D0          INCR MANT1
001044  64 06            44          BCC     FPDIV2         SKIP IF NO CY
                         45
                         46 * SET MANT1= $80000000 AND INCREMENT EXP1
                         47
001046  E290             48          ROXR.L  #1,D0
001048  5206             49          ADDQ.B  #1,D6
                         50
                         51 * THE RESULT IS ZERO IF EXP1 OVERFLOWS
                         52
00104A  67 0E            53          BEQ     DRZER          RZER ON X1 OVFL
                         54
                         55 * CALCULATE THE EXPONENT OF THE RESULT
                         56
00104C  9846             57 FPDIV2   SUB.W   D6,D4          16 BIT SUBTR
00104E  D87C 0080        58          ADD.W   #128,D4        CORRECT EXP
                         59 *        DO NOT TEST FOR OV/UNFL NOW
                         60
                         61 * CALCULATE AND STORE THE SIGN OF THE RESULT
                         62
001052  1C38 190A        63          MOVE.B  S2.W,D6
001056  BD38 1902        64          EOR.B   D6,S1.W
                         65
                         66 * HERE IS WHERE THE INTEGER DIVIDE IS DONE;
                         67 * AFTERWARDS NORMALIZE IF NECESSARY
                         68
00105A  3A09             69          MOVE.W  A1,D5
00105C  67 08            70          BEQ     DIVX
                         71
                         72 * THE RESULT IS EQUAL TO OR GREATER THAN ONE;
                         73 * NORMALIZE RIGHT & INCREMENT EXP1
                         74
00105E  E24D             75          LSR.W   #1,D5
001060  E293             76          ROXR.L  #1,D3
001062  E256             77          ROXR.W  #1,D6
001064  5244             78          ADDQ.W  #1,D4
                         79
                         80 * TEST EXPONENT FOR UNDERFLOW OR OVERFLOW AND
                         81 * RETURN EXP1, MANT1, GUARD BYTE TO FPACC#1
                         82
                         83 * END OF FLOATING POINT DIVIDE
                         84
   # 00000FF0            85 OVFL     EQU     $000FF0
   # 00000FF4            86 RZER     EQU     $000FF4
   # 00001066            87 DIVX     EQU     $001066
   # 0000105A            88 DRZER    EQU     $00105A
   # 00001902            89 S1       EQU     $001902
   # 00001903            90 X1       EQU     $001903
   # 00001904            91 M1       EQU     $001904
   # 00001908            92 G1       EQU     $001908
   # 0000190A            93 S2       EQU     $00190A
   # 0000190B            94 X2       EQU     $00190B
