Figuring out file formats

  • Thread starter: rmco2003
Status
Not open for further replies.
There seem to be a few people on here who have done similar tasks for FF7. I'm working on a translation for another game, and most of its menus are stored as graphics in files labelled SPR. The format doesn't match any of the known SPR formats on the web, and the most I could find is that there are patterns of 00 hex bytes at some offsets.

I'm not really asking anyone to spend their free time hacking these files for me, but could someone point me in the right direction to working out what these files are made up of?
 
Sorry, I thought Tech Related was only for things related to the FF series.
 
Could you maybe upload 3 or 4 of those files (e.g. on Mirex's Trashbin™)?

Learning how to decode file formats isn't that easy. Even if you're familiar with data structures, it takes some experience to know what you have to look for in that file.

As I said, upload it somewhere, and I'll take a look at it.

 - Alhexx
 
Psst! Alhexx! The topic is in the wrong place. See the move-button? :-)
 
There are two types of files: one contains sprite data under the label SPR and the other contains animation data under the label ANI (I'm guessing). Here are 3 SPR files with their ANI files, and 2 SPR files without ANI files.

http://bin.mypage.sk/FILES/Files.zip
 
Psst! Alhexx! The topic is in the wrong place. See the move-button?  :-)
As you wish, my master.  :-P

rmco2003:
Okay, I'll try to take a look at the files this evening.

 - Alhexx
 
Hi rmco2003
I have also copied this question to a different board, if you don't mind, as it specializes in file formats. You can find it here: http://ffab.mypage.sk/viewtopic.php?t=38

I looked briefly through the data, and it looks like the .ANI files are quite empty (no images inside them, only some numbers, maybe animation times), while the .SPR files contain multiple images.
 
Excellent :-D thanks for your help mirex, I originally thought each SPR file contained one image, but now that I think about it, it makes more sense to group images together if they have animations. I've been stuck on figuring these things out for a year now, it's about time something got done :wink:
 
Hi! I looked into your files and was able to extract some images. I posted one of them into Mirex's Trashbin (http://bin.mypage.sk/FILES/Test1.JPG). There are no colors because I didn't convert from 16bpp.

I can't continue now, but tomorrow night I'll have more time to post more details, so I'll try to quickly explain the way images are compressed in the SPR files. As an example, I'll take "npc_priz_card.spr".

Code:
(11 bytes) Signature. ("DCSPRITE10" 00)
(4 bytes) Number of elements in the next 2 tables. (0x98) Maybe number of images.
Start of table 1:
    (4 bytes) Width for the first image? (0x00000140 = 320)
    ... (Repeat for every element in the table. 0x98 times.)
Start of table 2:
    (4 bytes) Height for the first image? (0x000000F0 = 240)
    ... (Repeat for every element in the table. 0x98 times.)
(4 bytes) Multiply by 2 to get pointer to a second structure. (*)
Array of pointers to images (0x98 elements):
    (4 bytes) Multiply by 2 to get pointer to first image. (*)
    ... (Repeat for every element in the table. 0x98 times.)
Start of first image: (Note this is the base offset for the above pointers marked with *.)
    ... (Here come the images.)
Second structure: (I haven't looked too much into it, but it seems similar to the above structure.)

With the example file, the first image starts at offset 0x00000733 and that address is the base for the pointers.

It seems images start with F0 00 00 00 and are followed by 0x74 bytes with value 0x00. (This is for the few ones I looked into, so I may be wrong.)

After this come the compressed data. (For the first image in the example file it's offset 0x000007AB.)

Data compression is as follows:

Code:
Scan line #1:
    (2 bytes) Number of sub-blocks for first scan line
    Sub-block #1:
        (2 bytes) Number of 0x0000 that should be injected.
        (2 bytes) Number of half-words (each one a pixel in 16bpp format) that follow and are supposed to be injected as they are.
        Pixels of sub-block:
            (2 bytes) 16bpp color for the pixel.
            ... (Repeat for every pixel.)
    Sub-block #2:
    Sub-block #3:
    ... (Repeat for each sub-block.)
Scan line #2:
Scan line #3:
... (Repeat for every scan line.)
That description doesn't guarantee that each scanline will be completed. The trick I used was to subtract the total number of pixels injected from the width (320 pixels) and inject that many pixels (with value 0x0000) to complete the scanline.


As an example (same file, same image):

Code:
Scan line #1: 02 00
  Sub-block #1: 91 00-03 00-AA A3-AA A3-AA A3
  Sub-block #2: 1B 00-04 00-AA A3-AA A3-AA A3-69 93
  <Inject pixels to complete width.>
Scan line #2: 02 00
  Sub-block #1: 90 00-07 00-2B BC-AA A3-AA A3-AA A3-AA A3-E7 82-AA A3
  Sub-block #2: 16 00-06 00-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3
  <Inject pixels to complete width.>
Scan line #3: 03 00
  Sub-block #1: 90 00-08 00-8C CC-CD D4-AA A3-AA A3-AA A3-A9 A3-86 72-08 8B
  Sub-block #2: 04 00-0C 00-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3-A9 A3
  Sub-block #3: 03 00-09 00-A9 A3-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3-AA A3-28 8B
  <Inject pixels to complete width.>
...
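
For anyone who wants to check the scheme by hand, scan line #1 from the example can be decoded with a few lines of Python. This is only a sketch of the description in this post, not the code used to extract the images: the name `decode_scanline` and the `fill` parameter are made up for illustration, and values are read as little-endian 16-bit numbers, which is what the byte dump suggests (02 00 = 2 sub-blocks).

```python
import struct

def decode_scanline(data, offset, width, fill=0x0000):
    """Decode one scan line; return (pixels, offset of the next scan line)."""
    (block_count,) = struct.unpack_from("<H", data, offset)
    offset += 2
    pixels = []
    for _ in range(block_count):
        skip, count = struct.unpack_from("<HH", data, offset)
        offset += 4
        pixels.extend([fill] * skip)  # run of injected 0x0000 pixels
        pixels.extend(struct.unpack_from("<%dH" % count, data, offset))
        offset += 2 * count
    # The stream does not pad the line, so complete it up to the image width.
    pixels.extend([fill] * (width - len(pixels)))
    return pixels, offset

# Scan line #1 from the example above (image width 320).
line1 = bytes.fromhex(
    "0200"
    "91000300" "AAA3AAA3AAA3"
    "1B000400" "AAA3AAA3AAA36993"
)
row, next_offset = decode_scanline(line1, 0, 320)
print(len(row), next_offset, hex(row[0x91]))
```

The two sub-blocks cover 0x91 + 3 + 0x1B + 4 = 179 pixels, so the remaining 141 pixels of the 320-pixel row are filled in at the end.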

Hope this helps. Anyway, I'll complete the docs when I have more time, but if it's not clear, feel free to ask. :)
 
awesome :-D now I just have to find someone to code a converter tool.. I only know AS level VB, not much to go on :wink:

Isn't it ironic that I spent a year of sifting through webpages trying to find out how this file format works, and I post here and it's done in a couple of days.. :-P
 
awesome now I just have to find someone to code a converter tool.. I only know AS level VB, not much to go on
I won't code a complete converter tool (as I'm looking into FF9 at the moment) but I can give you the code I'm writing to decompress the images from the SPR files when I'm done. (Even document it a bit. :wink:)

What I wanted to ask you is what version of VB do you know: 6.0 or .Net? I'm using C#, so the translation to VB .Net should be easy; and if it's 6.0 (or prior), that's what I mostly use at work.

Oh... just don't use the spec I gave above unless you are curious because it doesn't exactly apply to the smaller files you posted. (What I said about the 0x74 bytes with value 0x00 at the start of the images has a meaning.)


Isn't it ironic that I spent a year of sifting through webpages trying to find out how this file format works, and I post here and it's done in a couple of days..
Not really. In my experience, it's quite difficult to find the description of a file format on the Web. Yes, you'll find complete docs for most common file formats out there, but this seems to be a custom format.
In particular, games use a lot of custom formats to store data so unless there is interest in the game you won't find that info. But internally, games also use pretty common formats.

What I'm trying to say is that maybe SPR files are not common, but the way the images are compressed inside is a slight variation on a common compression algorithm for graphics (or data, really).
 
I know Visual Basic 6, .NET's a bit confusing for me :-P

I think it's great you've helped me out even this far, but there's a couple of things I'm not sure how to do..

Once I receive your code, would I be able to just reverse the procedure to generate working SPR files? Or would I be coding a recompression scheme from scratch?

I'm asking this because I'm not very familiar with converting between file formats, file headers, how things are stored, etc, and although I don't have a problem with coding my own routines, in fact I can't wait :wink:, I'd like to know that it's feasible to do so.
 
I know Visual Basic 6
No problem, I translated everything from C# to VB .Net but I'll be able to rewrite it into VB6 this weekend. (I forgot I didn't have VB6 at home.)

Once I receive your code, would I be able to just reverse the procedure to generate working SPR files? Or would I be coding a recompression scheme from scratch?
No, the code I'm writing is to decompress the images because I thought of doing that to help you, or anyone else, understand the format. I could decide to write a recompressor, but don't count on that as I'm not too fond of coding. :D
I have no problem helping you do it or even write some pseudocode. (As the saying goes: "Give a man a fish; you have fed him for today. Teach a man to fish; and you have fed him for a lifetime". That, and that I really hate programming.)

It's not complicated; you just have to know where to find the elements and these files have few things inside. The compression mechanism implies you have to search for the repetition of a color of your choice.

I'm asking this because I'm not very familiar with converting between file formats, file headers, how things are stored, etc, and although I don't have a problem with coding my own routines, in fact I can't wait, I'd like to know that it's feasible to do so.
It's as simple as reading/writing/seeking inside binary files. The best thing you could do is learn the format, try to rewrite it in your own terms and do the coding.

I think it's possible to rebuild a SPR from a group of images even from scratch, and that the game accepts it as valid. If you keep the same width and height as the original image it should accept them.
I'm not 100% sure about this, because the game could be expecting "something" from the data. As an example, if the game allocates M bytes of memory to load the compressed data into a buffer and your compressed data is bigger, you'll end up overwriting memory you weren't supposed to overwrite.

The easiest way to test this is to get the biggest compressed image, decompress it and save it uncompressed (but encoded) to the same SPR; then run the game.
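
To give an idea of what the reverse direction could look like, here is a rough Python sketch of a scan line encoder based on the sub-block description earlier in the thread. It is one possible approach, not the way the original tool worked: `encode_scanline` is a hypothetical helper, transparent pixels are represented as `None`, and a trailing transparent run is simply left out on the assumption that the decoder completes the line up to the image width.

```python
import struct

def encode_scanline(pixels):
    """Encode one row; `pixels` holds 16-bit values, or None where transparent."""
    blocks = []
    i, n = 0, len(pixels)
    while i < n:
        start = i
        while i < n and pixels[i] is None:
            i += 1
        skip = i - start
        if i == n:
            break  # trailing transparent run: left for the decoder to pad
        start = i
        while i < n and pixels[i] is not None:
            i += 1
        blocks.append((skip, pixels[start:i]))
    out = struct.pack("<H", len(blocks))
    for skip, run in blocks:
        out += struct.pack("<HH", skip, len(run))
        out += struct.pack("<%dH" % len(run), *run)
    return out

# Rebuild scan line #1 of the earlier example (width 320).
row = [None] * 0x91 + [0xA3AA] * 3 + [None] * 0x1B + [0xA3AA] * 3 + [0x9369]
row += [None] * (320 - len(row))
print(encode_scanline(row).hex())
```

A row stored "uncompressed but encoded" is just the special case with no `None` entries: it comes out as a single sub-block with a skip count of 0 followed by every pixel of the row.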


Now, I have something else to ask of you because I'm having some problems. I posted to Mirex's trashbin (http://bin.mypage.sk/FILES/SPR_img.zip) some images. I need you to tell me what the colors should be (or give me a link to a picture with them) as I'm not much of a gamer and don't recognize the game.


I included the VB .Net code I used to generate them just in case you want to look a bit. As I said, I'll rewrite it to VB6. The only public function there (last one) takes the source file, the "table" from where you want to take the image (at the moment 0 works fine), the image index, a color to substitute the compressed pixels and the destination file. It exports the specified image to a BMP file. The other functions are auxiliary and private.

I'm not sure BMP is the right format to export to, but as there are a lot of editors/converters you can use, I thought it was fine. I say this because there are transparency and masks that other formats support better. Oh, I convert the images from 16bpp to 24bpp. I think there are 16bpp BMPs, but I'm not sure about the kind of support they have.
(I took the info on BMP from Wikipedia.)

As is, the code has some bugs:
- Colors are wrong. (Or I'm not used to those new colorful games. :D)
- Some re-interpretation for colors may be needed as I'm really using 15bpp and discarding the 16th bit.
- Output BMP is flipped because I forgot BMPs start at the bottom and go up.
- Only works fine with "table" 0. I have some problems with "table" 1.
- VB .Net arrays suck. (I'm used to C and VB6 kinds of arrays.)


Maybe I overcomplicated things with the description I gave in a previous post, so here is a simplification (I hope).

The general structure for SPRs is:

Code:
- Signature.
- Image count.
- Widths for all the images. (Array from 0 To image_count-1)
- Heights for all the images. (Array from 0 To image_count-1)
- Pointer to second table ("table 1").
- First table ("table 0"):
    - Pointers to images. (Array from 0 To image_count-1)
    - Collection of compressed images. (image_count images.)
- Pointer to end of second table or third table?
- Second table ("table 1"):
    - Pointers to images. (Array from 0 To image_count-1)
    - Collection of compressed images. (image_count images.)
And sometimes "table 1" data is missing. (I'm looking into it.)


Now, the trick to work with those files is to know where each element starts, its size and the encoding.

a) The signature seems to be the same in all the files you posted:
    - It starts at offset 0.
    - Its size is 11 bytes (including the 0x00).
    - The encoding doesn't matter, but in ASCII it's the string "DCSPRITE10" followed by the byte 00.
b) The image count tells the total number of images that are stored inside the SPR file.
    - It starts at offset 11. (Because the signature is fixed.)
    - Its size is 4 bytes.
    - It's the representation of a 32-bit integer.
c) The widths for the images are an array of integers with the widths in pixels.
    - It starts at offset 15.
    - Its size is '4*image_count' because each element is 4 bytes long.
    - Each element represents a 32-bit integer.
d) The heights for the images are an array of integers with the heights in pixels.
    - It starts at offset '15+4*image_count'.
    - Its size is '4*image_count' because each element is 4 bytes long.
    - Each element represents a 32-bit integer.
e) The pointer to the second table tells where in the file the pointers to the second group of compressed images start.
    - It starts at offset '15+4*image_count*2'.
    - Its size is 4 bytes.
    - It's the representation of a 32-bit integer.
      Note: This is not a direct offset in the file, but it has to be multiplied by 2 and added to a base offset instead. (* See below.)
f) The first table of pointers tells where each image from the first table can be found in the file.
    - It starts at offset '15+4*image_count*2+4'
    - Its size is '4*image_count' as each element is 4 bytes long.
    - Each element represents a 32-bit integer.
      Note: Each element is not a direct offset in the file, but they have to be multiplied by 2 and added to a base offset instead. (* See below.)
g) The collection of compressed images for the first table has all the images from the first table in compressed form.
    - It starts at offset '15+4*image_count*2+4+4*image_count'.
    - Its size is variable and it depends on the size of the images and the compression obtained.
    - Each element is a compressed image and is described somewhere else.
h) Not sure how to interpret the pointer to the end of the second table (or third table).
    - Its start is determined by the pointer explained in (e).
    - Its size is 4 bytes.
    - It's the representation of a 32-bit integer.
      Note: This is not a direct offset in the file, but it has to be multiplied by 2 and added to a base offset instead. (** See below.)
i) The second table of pointers tells where each image from the second table can be found in the file.
    - Its start is determined by adding 4 to the offset of (h).
    - Its size is '4*image_count' as each element is 4 bytes long.
    - Each element represents a 32-bit integer.
      Note: Each element is not a direct offset in the file, but they have to be multiplied by 2 and added to a base offset instead. (** See below.)
j) The collection of compressed images for the second table has all the images from the second table in compressed form.
    - It starts '4+4*image_count' bytes after the start of the field described in (h).
    - Its size is variable and it depends on the size of the images  and the compression obtained.
    - Each element is a compressed image and is described somewhere else.

(*) As the size of the data inside the images is always a multiple of 2 bytes, the pointers seem to be expressed as half-words (16 bits) instead of bytes, so they have to be multiplied by 2.
    These pointers (multiplied by 2) are relative to a base offset. In the case of the pointers described in (e) and (f), the base offset is the start of (g).
(**) Similar to (*), but the base offset for the pointers in (h) and (i) is the start of (j).
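
As a rough illustration of items (a) to (g), the offsets above can be computed like this in Python. It is only a sketch under assumptions: `parse_spr_header` and the dictionary keys are made-up names, and the 32-bit fields are assumed to be unsigned and little-endian (the 16-bit values in the byte dumps are little-endian, but the byte order of the 32-bit fields isn't shown in this thread). The demo data is synthetic, not taken from a real file.

```python
import struct

SIGNATURE = b"DCSPRITE10\x00"

def parse_spr_header(data):
    """Parse fields (a) to (g) of the layout above from the bytes of an SPR file."""
    if data[:11] != SIGNATURE:
        raise ValueError("not a DCSPRITE10 file")
    (count,) = struct.unpack_from("<I", data, 11)
    widths = struct.unpack_from("<%dI" % count, data, 15)
    heights = struct.unpack_from("<%dI" % count, data, 15 + 4 * count)
    (next_table,) = struct.unpack_from("<I", data, 15 + 8 * count)
    pointers = struct.unpack_from("<%dI" % count, data, 19 + 8 * count)
    base = 19 + 12 * count  # start of (g), the base offset for (e) and (f)
    return {
        "count": count,
        "widths": widths,
        "heights": heights,
        "table1_offset": base + 2 * next_table,   # pointers count half-words
        "image_offsets": [base + 2 * p for p in pointers],
        "base": base,
    }

# Synthetic two-image header, followed by dummy image bytes.
demo = (SIGNATURE + struct.pack("<I", 2)
        + struct.pack("<2I", 320, 16) + struct.pack("<2I", 240, 8)
        + struct.pack("<I", 10) + struct.pack("<2I", 0, 4) + bytes(24))
header = parse_spr_header(demo)
print(header["image_offsets"], header["table1_offset"])
```

For image_count = 0x98 the base works out to 19 + 12 * 0x98 = 0x733, which matches the offset of the first image given earlier for "npc_priz_card.spr".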


I left out the explanation for the compression, but I'll explain it in some other post.

I think this explanation is simpler than the previous one (I may be wrong) and if you decide to make an editing tool for SPRs, I'd recommend you take your time to understand the general outline of the files. A pencil and paper, plus a hex-editor should help you a lot. :)
Try to find the start of all the components and then the blocks of data: you'll notice similar elements and such.


About file modification of SPRs, don't try to add new images to the existing files: just copy the data you can take from the original file (with the modifications necessary to create a valid SPR) and add/remove the data you want.
Another thing you can do is export all the images at once, modify them and rebuild the SPR. This is probably easier to do than writing functions for individual images, but you'll need more disk space at once.


-------------------------

Edit: Corrected the tables a little bit as I was missing a field.
 
Alright, here are the images with the correct colours, I couldn't get a front facing one of the prize one though:

1.png

3.png

4.png

2.png


[EDIT] pseudo code would be great, that would be very helpful :-P
 
OK, thanks. The colors look right now and I solved some of the bugs. I'll start the VB6 version tomorrow, so you can have it on Monday.

No problem about the pseudo-code, but start thinking about how you want to work with the extract/import functions; that is, if you want to be able to export/import/delete/update one image at a time or if you want to dump all the data once, modify it and then rebuild the SPR. (Or some other thing.)
 
Well, I think extracting all of the images and then rebuilding the SPR would be better for what I'd be using the code for. When you mention fixing bugs, what exactly were they, and did they relate to your comments about the number of 00s after the file signature? I'm just wondering how I'd be able to account for variable lengths of those bytes... Have you managed to access multiple tables rather than just the first one?

Also I was wondering if you solved the problem of some of the SPR files not having table 1 data?
 
I haven't posted any changes, but the things I fixed were:
- The colors.
- I can get images from both tables, without problems from the files you uploaded. The icons only use the first table (0). The sprites use the first table (0) to store the image with colors and the second table (1) to store the shadows. That means the widths and heights for each image are the same if you use table 0 or table 1.
- The amount of 00's is not after the signature. (Did I ever say so?) The signature includes only one 00 at the end (offset 0x0000000A).
- The thing I said about a lot of 00s was related to the image data, for which I haven't posted an updated reference, but the VB .Net version I posted had the solution. (See variable "height2".)
I said image data started with F0 00 00 00 and had 0x74 bytes with value 0x00 for "npc_priz_card.spr", but that is not the right interpretation of the data. The first 2 bytes of the image data are equal to its height and I think the real meaning is "number of compressed scanlines" but they are the same in the files you posted so I don't have a way to know. (So F0 00 = 240 is the number of rows you'll have to decode.) That's the reason the code shows a message box telling you the size for the height is not the same as "height2".
The other (now) 0x76 bytes with 00s are compressed scanlines. Every '00 00' is one scanline for which there are no sub-blocks, so the program has to jump to the start of the next scanline. They are treated like the other scanlines.
(So the real compressed data for the images starts 2 bytes after height2.)
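
Put together, the corrected reading of the image data (a 2-byte scanline count, then one record per scanline, where '00 00' is a scanline with no sub-blocks) could be sketched in Python as follows. This is an illustration of the description in this thread, not the VB code itself: `decode_image` and `fill` are made-up names, values are assumed little-endian, and the demo bytes are synthetic.

```python
import struct

def decode_image(data, offset, width, fill=0x0000):
    """Decode one image; returns a list of rows, each a list of 16-bit pixels."""
    (row_count,) = struct.unpack_from("<H", data, offset)  # "height2"
    offset += 2
    rows = []
    for _ in range(row_count):
        (block_count,) = struct.unpack_from("<H", data, offset)
        offset += 2
        row = []
        for _ in range(block_count):
            skip, count = struct.unpack_from("<HH", data, offset)
            offset += 4
            row += [fill] * skip
            row += struct.unpack_from("<%dH" % count, data, offset)
            offset += 2 * count
        row += [fill] * (width - len(row))  # also fills rows with 0 sub-blocks
        rows.append(row)
    return rows

# Synthetic 4x3 image: empty row, one sub-block (skip 1, 2 pixels), empty row.
tiny = bytes.fromhex("0300" "0000" "0100" "01000200" "11112222" "0000")
for line in decode_image(tiny, 0, 4):
    print([hex(p) for p in line])
```

With this reading, the 0x76 zero bytes at the start of the first image in "npc_priz_card.spr" are 59 empty scanlines, and 0x733 + 2 + 0x76 = 0x7AB, which is where the first non-empty scanline was found earlier.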


Some additional notes:
- The compressed pixels seem to be only the ones that are invisible; so they define the mask for the sprite.
- Each table seems to refer to a layer in the drawing process of a sprite and there is the possibility that there are more layers in other files.

I'll give you an updated doc with the "final" format.

Remaining known bugs:
- Interpretation of "last table". It seems the pointer to the next table points to 00 00 00 00 when you are in the last table. (I'm almost sure, but I haven't added this to the code to test.)
- Flipped BMPs. That's trivial to solve and I'll do it for the VB6 version.


Additional note: All the offsets I've used are 0 based, but binary files are 1 based in VB6.
 
Alright then, I was just wondering if there would be any issues with any other files. Here's a link with 15 other SPR files, I know it's a lot but it should help to work out information about the first 2 bytes of the data.

Just to let you know these are much larger than the other files I posted, and they'll probably be the ones I'm editing.

[link removed]

#### MOD EDIT ####
Hi, this is your friendly neighborhood moderator. Discussion of formats is fine, screenshots are OK, and snippets of data are workable. However, posting whole files can get us in trouble. Please don't.

ktanks ^_^

-Halkun

[EDIT]

Sorry about that, anyway I compiled your code with VB.NET express and extracting images from the files I posted works just fine, so it seems that your solution works fine.

[EDIT2]

There is one problem I noticed with extracting from attack information, namely it doesn't get anything from any table or image number I choose. Maybe these are the sprites without table data that you were talking about? Unless I'm remembering what you said wrong... I'd post one but I'm not sure if I'm allowed to (need confirmation halkun). If BMP is a bad format to convert to for transparency, etc, then I'd gladly accept another format, I'd be using Paint Shop Pro X to edit these files, and that program can work with a large amount of file formats, so I'd go for 100% compatibility with the data rather than a common file format.

Once again thanks a lot for your help, thanks to your .NET code (I installed .NET express on another computer) I've been able to create a list of all of the files I need to edit.

Also, a final note: some of the newer graphics have partial transparency (can't remember the exact term), sort of translucent, with the border around it completely transparent. It seems to use the same non-existent colour; I noticed this is a pure white, since the translucent colour is white. I noticed all of this when I extracted these menus and compared them to in-game screenshots.
 
I think I finished the program. I left it in Mirex's trashbin (http://bin.mypage.sk/FILES/SPRExport.zip). All the bugs I could find are fixed. (I deleted the previous files, as there's no sense in keeping them.)

The structure of SPRs seems to be complete now (there weren't any changes) so I'll start with the "final" document. I tested all the files you posted and the program works fine.
I corrected a lot of bugs in the BMP generation, as the code I previously posted worked fine with images that had a number of columns divisible by 4 but had serious problems for other images.
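
The width problem is what you would expect from the BMP row padding rule: every row of pixel data is padded to a multiple of 4 bytes, and a 24bpp row takes 3 bytes per pixel, so the padding is zero only when the width is divisible by 4. The Python sketch below shows that rule together with the bottom-up row order. It is not the exporter's code: `row_stride` and `bmp24_bytes` are made-up names, and the resolution value of 2835 pixels per metre is an arbitrary choice.

```python
import struct

def row_stride(width):
    """Bytes per 24bpp BMP row: 3 bytes per pixel, rounded up to a multiple of 4."""
    return (3 * width + 3) & ~3

def bmp24_bytes(rows):
    """Build a 24bpp BMP from rows of (r, g, b) tuples given top row first."""
    height, width = len(rows), len(rows[0])
    padding = bytes(row_stride(width) - 3 * width)
    pixel_data = bytearray()
    for row in reversed(rows):              # BMP stores the bottom row first
        for r, g, b in row:
            pixel_data += bytes((b, g, r))  # channel order in the file is B, G, R
        pixel_data += padding
    offset = 14 + 40                        # file header + BITMAPINFOHEADER
    file_header = struct.pack("<2sIHHI", b"BM", offset + len(pixel_data), 0, 0, offset)
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                              0, len(pixel_data), 2835, 2835, 0, 0)
    return file_header + info_header + bytes(pixel_data)

print(row_stride(320), row_stride(3))
image = [[(255, 0, 0)] * 3, [(0, 0, 255)] * 3]   # red row on top, blue row below
print(len(bmp24_bytes(image)))
```

Writing the result of `bmp24_bytes` to a file opened in binary mode gives a file that common image editors should open.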

I ended up assuming the first 2 bytes of the image data were the number of scanlines but left the message box to mark possible differences with the height. Anyway, I think it's unlikely you'll find files with any difference in those fields, because all images must have been compressed with the same tool, and the only motive I see to make the numbers different is to compress a little bit more; but the compression is only used to encode the mask, not to save space.

I also took into account the possibility that there are more than 2 tables (layers) and the program assumes they have the same structure as tables 0 and 1.

The only function you'll find to work with SPRs is the one that exports one of them to BMP. If you want, you can add functions to get the number of tables, number of images, or anything else; just take the export function as an example.

The export function is SLOOOOOWWWWW because it doesn't have any kind of optimization: data is read/written byte by byte, function calls are used inside the loops and it's written in VB :D. It isn't a model for good programming practices, but you have a lot of comments. (More than I'm used to.)

Be very careful with the data types. I had to use some tricks to avoid the automatic type casts that VB6 does. Try to define all the integer variables as Long instead of Integer and be careful when you convert data to Long or define constants, because if VB converts them to Integer first, the numbers between 32768 and 65535 can be converted to negative numbers without you knowing. (VB isn't suited for this kind of job.)
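
To show what goes wrong with the numbers, here is the same 2-byte value read as a signed and as an unsigned 16-bit integer. The example is in Python rather than VB, purely to illustrate the arithmetic; `to_unsigned16` is a made-up helper name.

```python
import struct

raw = bytes.fromhex("AAA3")                 # one pixel from the example data
(as_signed,) = struct.unpack("<h", raw)     # what a 16-bit signed integer holds
(as_unsigned,) = struct.unpack("<H", raw)   # the value the format intends
print(as_signed, as_unsigned)

def to_unsigned16(value):
    """Recover 0..65535 from a value that was read as signed 16-bit."""
    return value & 0xFFFF

print(to_unsigned16(as_signed))
```

The signed reading is the unsigned value minus 65536 whenever the value is 32768 or more, which is exactly the range mentioned above.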

You'll have to modify the examples in Form1 to your needs. Just don't uncomment all the lines at once or you'll get bored of waiting.

One last thing... when choosing the mask color to use, you are free to use any color, but if you choose a color with one of the following properties, you can be sure the mask won't conflict with any color the image has:
- Any component of red not divisible by 8.
- Any component of green not divisible by 4.
- Any component of blue not divisible by 8.
This is valid for the 24bpp representation (0-255 for each color) because when you convert from 16bpp to 24bpp you are not mapping to all possible colors.
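
The divisibility rule follows if the 16bpp pixels are 5-6-5 (5 bits red, 6 green, 5 blue) and are expanded to 24bpp by shifting, so that every converted component is a multiple of 8, 4 and 8 respectively. The Python sketch below checks that. It rests on assumptions: the function names are made up, and the bit order (red in the top bits, blue in the bottom bits) is a guess that the thread doesn't confirm.

```python
def rgb565_to_rgb888(pixel):
    """Expand a 16-bit 5-6-5 pixel to (r, g, b) by shifting, without rescaling."""
    r = (pixel >> 11) & 0x1F   # assumed: red in the top 5 bits
    g = (pixel >> 5) & 0x3F    # 6 bits of green in the middle
    b = pixel & 0x1F           # assumed: blue in the bottom 5 bits
    return (r << 3, g << 2, b << 3)

def is_safe_mask(r, g, b):
    """True if no 5-6-5 pixel can expand to this 24bpp color."""
    return r % 8 != 0 or g % 4 != 0 or b % 8 != 0

print(rgb565_to_rgb888(0xA3AA))
print(is_safe_mask(255, 0, 255), is_safe_mask(248, 0, 248))
```

With this kind of conversion, (255, 0, 255) can never come out of an image pixel, while (248, 0, 248) can, so only the first is a safe mask color.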

Feel free to ask any question you have about the program.
 