
CBD(5)                    FILE FORMATS                     CBD(5)

NAME
     cbd - vector map data format for mapbrowser and mapwriter

DESCRIPTION
     This is a file format for  compressed  binary  map  database
     (*.cbd) files, designed to encode features from a vector map
     database.  Map data in this format can be used by  the  map-
     browser  and  mapwriter programs to browse and render raster
     map images from the vector data.

     This format encodes digital line graph (DLG) data in latitude-
     longitude coordinates in a compressed form designed by Brian
     Reid and slightly modified by Steve Putz.  It was designed to
     encode map data efficiently, such as that from the World Data
     Bank II.  The cbd files used by mapbrowser are 7-9 times
     smaller than the original uncompressed data files.

World Data Bank II Database
     The CIA World Data Bank II database is available from the U.S.
     Government and is in the public domain.  The map database is
     divided  into  continents  (North  America,  South  America,
     Europe,  Africa,  and  Asia),  and  within  continents it is
     divided into "Coastlines, Islands, and Lakes"  (cil  files),
     "Rivers"  (riv  files), "International political boundaries"
     (bdy files) and "National political boundaries" (pby files).
     Each  file  is divided into several thousand "segments", and
     each segment is divided into some number of "strokes".

     The original CIA World Data Bank II data is a series of COBOL
     records specifying 5,719,617 individual vectors, which occupy
     about 130 megabytes of disk space.  (This format is apparently
     compatible with a package called GS-CAM.)  The "cbd" files
     used by the mapbrowser program are compressed binary encodings
     of that data and collectively occupy about 15 megabytes of
     disk space.  The "cbd" files are produced from the original
     data by the ciamap program.

U.S. GeoData
     The Earth  Science  Information  Centers  (ESIC)  distribute
     digital  cartographic/geographic data files produced  by the
     U.S.  Geological Survey (USGS) as part of the National  Map-
     ping  Program.   1:2,000,000-scale  DLG  data for the United
     States is available on CD-ROM.  The "Graphic" format on the
     CD-ROM is the same GS-CAM format used by the World Data Bank
     II database.

Data File Format
     The .cbd encoding divides a database into several files.  In
     each file there is a header record, a series of segments and
     strokes, and  a  segment  index.   A  separate  mapset  file
     describes the relationships of the files in a database.

MAPBROWSER          Last change: 7 June 1993                    1

File Header Format
     The original format had a 40-byte header consisting of just
     the first five integer values shown below plus 20 unused
     bytes.  The modified format used by mapbrowser adds eight
     additional integer fields, for a total of thirteen 4-byte
     integers (52 bytes).

     struct cbdhead {
          long  magic;        /* Magic number */
          long  dictaddr;     /* Offset of segment dictionary in file */
          long  segcount;     /* Number of segments in file */
          long  segsize;      /* Size of segment dictionary (bytes) */
          long  segmax;       /* Size of largest segment's strokes (bytes/2) */
          /* the following apply to CBD_MAGIC2 only */
          long  maxlat,minlat,maxlong,minlong; /* Bounding box of map */
          long  features;     /* bits indicate feature "ranks" present */
          long  scale_shift;  /* bits to shift coordinate data */
          long  lat_offset;   /* latitude offset */
          long  lng_offset;   /* longitude offset */
     };
     The first 4 bytes in Brian Reid's original cbd files contain
     0x20770002 (CBD_MAGIC).  For the extended format, the first
     4 bytes should contain 0x20770033 (CBD_MAGIC2).

     #define CBD_MAGIC      0x20770002
     #define CBD_HEADSIZE1  40                        /* size of old header */
     #define CBD_MAGIC2     0x20770033
     #define CBD_HEADSIZE2  (sizeof(struct cbdhead))
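
     As an illustration of how a reader might tell the two header
     variants apart, the sketch below maps a magic number to the
     header length.  The function name cbd_header_size is
     hypothetical and does not come from cbdmap.h.  The sketch
     assumes 4-byte integers (so the extended header is 52 bytes)
     and assumes the caller has already assembled the first 4 bytes
     of the file in the file's byte order, which this page does not
     specify.

```c
#include <stddef.h>
#include <stdint.h>

#define CBD_MAGIC      0x20770002
#define CBD_HEADSIZE1  40            /* size of old header */
#define CBD_MAGIC2     0x20770033
#define CBD_HEADSIZE2  52            /* assumed: 13 four-byte integers */

/*
 * Hypothetical helper: return the header length in bytes for the
 * given magic number, or 0 if the value is not a known cbd magic.
 */
size_t cbd_header_size(uint32_t magic)
{
    switch (magic) {
    case CBD_MAGIC:
        return CBD_HEADSIZE1;
    case CBD_MAGIC2:
        return CBD_HEADSIZE2;
    default:
        return 0;
    }
}
```

     A return value of 0 lets the caller reject files that are not
     in either cbd variant before trying to interpret the rest of
     the header.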

     In order to support efficient skipping of  an  entire  file,
     the extended format header includes coordinates defining the
     bounding box of all vectors in the file and a bit mask indi-
     cating  which  feature  codes  are present in the file.  The
     scale_shift value (if non-zero) is an exponent  for  scaling
     the   coordinate   data  by  a  power  of  two.  A  negative
     scale_shift allows coordinates to be represented at a  reso-
     lution  finer  than  integer  seconds of latitude/longitude.
     The latitude/longitude offsets are added to the vector data
     after scaling.  The scale and offsets are presumably also
     applied to the bounding box coordinates in the file header
     and segment dictionary.
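
     One possible reading of the scaling rule is sketched below: a
     stored value is multiplied by two raised to scale_shift, and
     the offset is then added.  The function name
     cbd_coord_to_seconds is hypothetical.  The direction of the
     scaling and the assumption that the offset is expressed in
     seconds are interpretations of the text above, not something
     confirmed by the mapbrowser source.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert one stored coordinate to seconds of
 * latitude or longitude.
 *
 * Assumed interpretation: result = raw * 2^scale_shift + offset.
 * With a negative scale_shift the stored value carries fractional
 * seconds, so the result is returned as a double.
 */
double cbd_coord_to_seconds(int32_t raw, int32_t scale_shift, int32_t offset)
{
    double scale = 1.0;
    int32_t i;

    /* Build 2^scale_shift without relying on the math library. */
    for (i = 0; i < scale_shift; i++)
        scale *= 2.0;
    for (i = 0; i > scale_shift; i--)
        scale *= 0.5;

    return (double)raw * scale + (double)offset;
}
```

     Under this reading, a stored value of 8 with a scale_shift of
     -2 and an offset of 100 corresponds to 102 seconds.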

Segment Header Format
     Each segment begins with a segment header, followed by the
     compressed stroke data.

     struct seghead {
          BIT32  orgx,orgy;  /* Origin of first stroke in segment */
          BIT32  id;         /* Segment identifier serial number */
          BIT16  nstrokes;   /* How many strokes in the segment follow */
     };

     The data-compression scheme uses integer seconds to represent
     latitude/longitude coordinates (shifted and scaled as
     indicated in the file header), and stores each stroke as a
     [dx,dy] offset from the previous point.  If dy fits into 8
     bits and dx into 7 bits, the entire [dx,dy] is stored in a
     16-bit field with bit 0x4000 turned on as a flag.  If either
     value is too large for that scheme, both are stored as 32-bit
     values, with the 0x40000000 bit turned off (even in negative
     numbers) in the first of them.

     #define MAX8y      ((short) 0x7F)  /* largest signed 8-bit value */
     #define MAX8x      ((short) 0x3F)  /* largest signed 7-bit value */
     #define SHORTFLAG  0x4000          /* flag saying this is a short stroke */
     #define SHORTBYTE  0x40            /* the same flag within the high-order byte */
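
     The choice between the two stroke forms can be sketched as
     follows.  Both function names are hypothetical.  The symmetric
     ranges (-63..63 for dx, -127..127 for dy) are an assumption
     based on the MAX8x and MAX8y limits.  The second function also
     assumes that the first 16 bits read from the file are the
     high-order half of a long stroke's first 32-bit value, so that
     the 0x4000 bit distinguishes the two forms.

```c
#include <stdint.h>

#define MAX8y      ((short) 0x7F)  /* largest signed 8-bit value */
#define MAX8x      ((short) 0x3F)  /* largest signed 7-bit value */
#define SHORTFLAG  0x4000          /* flag saying this is a short stroke */

/*
 * Hypothetical helper for a writer: nonzero if [dx,dy] is small
 * enough for the 16-bit form.  The symmetric range is an assumption.
 */
int stroke_fits_short(int32_t dx, int32_t dy)
{
    return dx >= -MAX8x && dx <= MAX8x &&
           dy >= -MAX8y && dy <= MAX8y;
}

/*
 * Hypothetical helper for a reader: given the first 16 bits of a
 * stored stroke, report whether the stroke uses the short form.
 */
int stroke_is_short(uint16_t first16)
{
    return (first16 & SHORTFLAG) != 0;
}
```

     A writer would call stroke_fits_short before emitting each
     stroke, and a reader would call stroke_is_short to decide
     whether 2 bytes or 8 bytes make up the next stroke.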

Segment Dictionary Format
     The segment dictionary sits at the end of the file.  It is
     pointed to by the dictaddr value in the file header, and its
     entries point to the segment headers via their absaddr values.

     struct segdict {
          BIT32  segid;    /* Segment identifier serial number */
          BIT32  maxlat,minlat,maxlong,minlong; /* Bounding box of strokes */
          BIT32  absaddr;  /* Address in file of segment header */
          BIT16  nbytes;   /* # bytes of strokes that follow */
          BIT16  rank;     /* Type of feature this segment draws */
     };
     The bounding box information in each segment dictionary
     entry allows segments that lie entirely outside the clipping
     region to be skipped quickly.  The rank value is a positive
     integer feature code indicating which map feature type is
     represented by the segment.  The feature codes are normally
     defined in the mapset file for the database.
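
     The bounding box test can be sketched as a rectangle overlap
     check.  The view_box type and the seg_overlaps_view function
     are hypothetical.  The sketch maps BIT32 and BIT16 to int32_t
     and int16_t, assumes the view rectangle uses the same units as
     the dictionary entry, and ignores views that wrap around the
     180-degree meridian.

```c
#include <stdint.h>

struct segdict {
    int32_t segid;                           /* Segment identifier serial number */
    int32_t maxlat, minlat, maxlong, minlong; /* Bounding box of strokes */
    int32_t absaddr;                         /* Address in file of segment header */
    int16_t nbytes;                          /* # bytes of strokes that follow */
    int16_t rank;                            /* Type of feature this segment draws */
};

/* Hypothetical description of the region being drawn. */
struct view_box {
    int32_t maxlat, minlat, maxlong, minlong;
};

/*
 * Hypothetical helper: nonzero if the segment's bounding box
 * intersects the view, zero if the segment can be skipped.
 */
int seg_overlaps_view(const struct segdict *sd, const struct view_box *v)
{
    return sd->minlat  <= v->maxlat  && sd->maxlat  >= v->minlat &&
           sd->minlong <= v->maxlong && sd->maxlong >= v->minlong;
}
```

     A renderer could run this test on every dictionary entry and
     seek to absaddr only for the entries that pass.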

Feature Codes
     The feature code (rank) is an indication of what  each  seg-
     ment depicts.  The assignments of codes can be different for
     different databases but must be consistent among  cbd  files
     within a database.  The following are the feature codes used
     in the World Data Bank II database:

     In "Boundary" files:
          01  Demarcated or delimited boundary
          02  Indefinite or in Dispute
          03  Other line of separation of sovereignty on land

     In "Coast, Islands and Lakes" files:
          01  Coast, islands and lakes that appear on all maps
          02  Additional major islands and lakes
          03  Intermediate islands and lakes
          04  Minor islands and lakes
          06  Intermittent major lakes
          07  Intermittent minor lakes
          08  Reefs
          09  Salt pans -- major
          10  Salt pans -- minor
          13  Ice Shelves -- major
          14  Ice Shelves -- minor
          15  Glaciers

     In "Rivers" files:
          01  Permanent major rivers
          02  Additional major rivers
          03  Additional rivers
          04  Minor rivers
          05  Double lined rivers
          06  Intermittent rivers -- major
          07  Intermittent rivers -- additional
          08  Intermittent rivers -- minor
          10  Major canals
          11  Canals of lesser importance
          12  Canals -- irrigation type

FILES
     cbdmap.h       Header file for compressed binary  map  data-
                    base (*.cbd) files.

     ciamap.c       Program for converting from GS-CAM format  to
                    cbd format.

     /import/mapbrowser
                    Location of mapbrowser software and  data  at
                    Xerox PARC.

BUGS
     The format  described  here  includes  extensions  of  Brian
     Reid's  original cbd format necessary to support features of
     the mapbrowser program.  Files in  this  format  are  unfor-
     tunately not compatible with Brian's netmap program.

AUTHOR
     Original format by Brian Reid <reid@pa.dec.com>
     Modified by Steve Putz <putz@parc.xerox.com>

SEE ALSO
     mapset(5), mapbrowser(1), mapwriter(1)


