The Only List of GIS File Formats You Need
In the field of Geographic Information Systems (GIS), data organization and representation are crucial for gaining valuable insights. Behind the scenes, there are various file formats that help us effectively manage and analyze geospatial information. Two primary formats, vector and raster data, play distinct roles in GIS. In this article, we will explore these GIS file formats and clarify the key differences between vector and raster data. Whether you’re new to GIS or seeking a refresher, this guide will provide a solid foundation to navigate the world of GIS file formats. Let’s dive in and uncover the essentials of vector and raster data.
What is GIS?
GIS is a computer-based tool that facilitates the capture, manipulation, analysis, and presentation of spatial and geographic information.
The term GIS stands for Geographic Information Systems, where “Geographic” pertains to the physical location or position on the Earth’s surface. “Information” refers to data and details that can be utilized to represent and enhance the understanding of a specific location. Lastly, “Systems” pertains to a tool that allows multiple components to work together to achieve a common objective.
GIS helps make sense of complex data in a geographic context, which supports well-informed decisions in fields such as urban planning, environmental management, and public health.
To put it another way, GIS merges geographic data, such as maps, satellite imagery, and geospatial databases, with other forms of data, such as demographic, economic, and environmental data, to uncover patterns and trends in various phenomena.
What are GIS file formats and why are they important?
GIS file formats are specific file types used to store and manage spatial and geographic data in GIS software. There are numerous GIS file formats, and each has its own features and capabilities, which we cover later in this article.
The availability of different GIS file formats is important for GIS software because it allows users to store and manage various types of spatial and geographic data in a flexible and efficient manner. Different GIS file formats are optimized for specific types of data, such as vector or raster data.
Having a variety of GIS file formats also enables interoperability between different GIS software and systems, making it easier for users to share data and collaborate on projects. For example, users can import and export data between GIS software using compatible file formats, ensuring that data is accessible and usable across different platforms.
What is more, the choice of GIS file format can impact the performance and accuracy of GIS analysis. Some GIS file formats are designed to handle large geographic datasets, while others are better suited for storing detailed attribute information.
Choosing the appropriate GIS file format for a specific project can help ensure that the data is accurate, consistent, and easily accessible.
What is the most common GIS format?
The most common GIS file format is the Shapefile (.shp), which was developed by ESRI (Environmental Systems Research Institute) in the 1990s. Shapefiles are widely used for storing and managing vector data, such as points, lines, and polygons, and are compatible with most GIS software.
However, other GIS file formats have gained popularity in recent years, such as the Geodatabase (.gdb) format used by ESRI’s ArcGIS software and the GeoJSON (.geojson or .json) format used for web-based mapping applications.
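To make the GeoJSON format a little more concrete, here is a minimal sketch of a GeoJSON document built and read back with Python's standard `json` module. The coordinates and the `name` property are made-up sample values. GeoJSON stores positions in longitude, latitude order.

```python
import json

# A FeatureCollection holding one point feature.
# The coordinates and the "name" property are illustrative sample values.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [13.4050, 52.5200]},
            "properties": {"name": "Sample point"},
        }
    ],
}

# GeoJSON is plain JSON text, so a round trip needs no GIS library.
text = json.dumps(feature_collection, indent=2)
loaded = json.loads(text)

# Positions are stored as [longitude, latitude].
lon, lat = loaded["features"][0]["geometry"]["coordinates"]
print(lon, lat)
```

Because the file is ordinary JSON, a web browser can parse it directly, which is a large part of its appeal for web maps.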
What is the difference between vector and raster data?
Vector and raster data are two common formats used in GIS to represent and analyze spatial information. Let’s look at the difference between the two:
Vector data
Vector data refers to a type of spatial data representation that uses geometric shapes to represent geographic features on the Earth’s surface. Vector data is based on points, lines, and polygons and is widely used to represent discrete objects and features such as roads, buildings, rivers, and administrative boundaries.
Raster data
Raster data, also known as grid data or surface data, represents the landscape using a rectangular matrix of square cells, or pixels. Each pixel in a raster corresponds to a small area on the Earth’s surface and contains attribute information like height, temperature, or color.
There are three main types of raster data: thematic data, spectral data, and pictures.
Thematic data represents categorical information, such as land cover classes or vegetation types. Spectral data represents continuous variables, like reflectance values from remote sensing imagery. Pictures, as the name suggests, represent visual images captured by satellites or other imaging devices.
In a nutshell, vector data uses points, lines, and polygons to represent discrete objects, while raster data uses a grid of cells to represent continuous phenomena.
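The contrast can be sketched in a few lines of plain Python. This is an illustration only, not how any particular GIS library stores data: the `Raster` class, the feature dictionaries, and all coordinate and elevation values are hypothetical.

```python
from dataclasses import dataclass

# Vector: discrete features stored as coordinate geometry plus attributes.
# Coordinates are made-up (x, y) values in an arbitrary planar system.
well = {"type": "point", "coords": (2.5, 3.5), "name": "Well A"}
road = {"type": "line", "coords": [(0.0, 0.0), (4.0, 1.0), (8.0, 5.0)]}
parcel = {
    "type": "polygon",
    # A polygon ring is closed: the last vertex repeats the first.
    "coords": [(1.0, 1.0), (5.0, 1.0), (5.0, 4.0), (1.0, 4.0), (1.0, 1.0)],
}


@dataclass
class Raster:
    """A grid of cell values anchored at its upper-left corner."""
    origin_x: float   # x of the upper-left corner
    origin_y: float   # y of the upper-left corner
    cell_size: float  # width and height of one square cell
    cells: list       # rows of values; the first row is the northernmost

    def value_at(self, x, y):
        """Return the value of the cell covering (x, y), or None if outside."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((self.origin_y - y) // self.cell_size)
        if 0 <= row < len(self.cells) and 0 <= col < len(self.cells[0]):
            return self.cells[row][col]
        return None


# Raster: a continuous surface (here, elevation) sampled on a regular grid.
elevation = Raster(
    origin_x=0.0, origin_y=6.0, cell_size=2.0,
    cells=[
        [120, 125, 131, 140],
        [118, 122, 128, 135],
        [115, 119, 124, 130],
    ],
)

# Look up the elevation under the vector point.
print(elevation.value_at(*well["coords"]))  # 122
```

The vector features carry exact coordinates for each object, while the raster answers the question "what is the value here?" for any location inside the grid.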
List of GIS file types
Now that we understand the distinction between raster and vector data, let’s delve into the various types of GIS files and examine the most common ones.
Vector GIS file formats
Vector GIS file formats are file types used to store and manage vector data in GIS software. Vector data is the most common type of GIS data.
As we have already learned, vector data consists of points, lines, and polygons that represent geographic features such as roads, buildings, and water bodies. Vector GIS file formats are optimized for storing and managing this type of data and can contain various attributes associated with each feature.
Some of the most common vector GIS file formats include:
- Shapefile (.shp): Shapefiles consist of several files, including a main .shp file that contains the geometry of the features, a .dbf file that contains attribute information, and a .shx file that indexes the geometry. This format has been in use for many years, and its widespread adoption has made it the de facto standard for GIS data interchange.
- .shx: index files associated with shapefiles. They store index information that helps in faster access and retrieval of spatial data stored in the corresponding shapefile. SHX files work in conjunction with the main .SHP file to enhance performance in GIS applications.
- .dbf: dBASE format files commonly used in GIS for storing attribute data. They store tabular data, such as attribute values, associated with spatial features in a shapefile.
- .prj: PRJ files are projection files that define the coordinate system and spatial reference information for a GIS dataset.
- Geodatabase (.gdb): A proprietary file format used by ESRI’s ArcGIS software to store vector and raster data along with attribute information.
- GeoJSON (.geojson or .json): A popular file format used for web-based mapping applications. It stores vector data as JSON, which web browsers can read and manipulate easily.
- Keyhole Markup Language (.kml): An XML-based file format used to display geographic data in Google Earth and other GIS software.
- KMZ (.kmz): Short for KML-Zipped, a compressed version of KML. It has become the default format for saving geospatial data in Google Earth because compression produces smaller files.
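- MapInfo TAB (.tab): A vector GIS file format native to MapInfo software (originally from MapInfo Corporation, later part of Pitney Bowes Software), used for storing and managing point, line, and polygon data along with associated attribute information. A TAB dataset is a set of files that share a base name, typically .TAB, .DAT, .MAP, and .ID, plus an optional .IND.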
- DAT (.dat): A generic extension used for various purposes, including GIS applications. In GIS, .DAT files can store different types of data, such as point coordinates, attribute tables, or other tabular data related to geographic features. In a MapInfo TAB dataset, for example, the .DAT file holds the attribute table. The structure and content of .DAT files vary depending on the software that creates or reads them.
- ID (.id): An index file that is part of a MapInfo TAB dataset. It links each graphic object stored in the .MAP file to its row of attribute data in the .DAT file, allowing efficient retrieval and display of geographic features.
- MAP (.map): An extension used by more than one GIS package, with different meanings. In a MapInfo TAB dataset, the .MAP file stores the graphic objects (the geometry) of the table. In MapServer, a .map file (a "Mapfile") is a text configuration file that describes a map, including layers, symbols, labeling settings, and spatial referencing. A Mapfile does not contain the actual spatial data; it references the data sources instead.
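Because a shapefile is really a group of files that share a base name, a quick completeness check before sharing or loading one can save some confusion. The sketch below uses only Python's standard library. The helper names `shapefile_parts` and `missing_required` and the `roads` dataset are hypothetical, and treating .prj as optional is an assumption based on the component descriptions above.

```python
import tempfile
from pathlib import Path

# Parts described above: geometry, geometry index, attributes, projection.
REQUIRED = (".shp", ".shx", ".dbf")
OPTIONAL = (".prj",)


def shapefile_parts(shp_path):
    """Map each expected extension to whether that sibling file exists."""
    base = Path(shp_path)
    # Note: on case-sensitive file systems, "ROADS.SHX" will not match ".shx".
    return {ext: base.with_suffix(ext).exists() for ext in REQUIRED + OPTIONAL}


def missing_required(shp_path):
    """List the required extensions that are absent next to shp_path."""
    parts = shapefile_parts(shp_path)
    return [ext for ext in REQUIRED if not parts[ext]]


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    shp = Path(tmp) / "roads.shp"
    for ext in (".shp", ".dbf"):
        shp.with_suffix(ext).touch()
    missing = missing_required(shp)

print(missing)  # ['.shx']
```

A missing .prj does not stop most software from reading the geometry, but the coordinate system then has to be supplied by hand.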
Other vector file formats:
- IND (.ind): An attribute index file. It originates in the MapInfo TAB dataset, where it indexes table columns so that features can be found quickly by field value. Some tools can also write an .IND index next to a shapefile (alongside the .SHP, .DBF, and .SHX files) to speed up attribute queries. The file is optional and is created and used by GIS software only when a column has been indexed.
- OpenStreetMap (.osm): An XML-based vector format used by the OpenStreetMap collaborative mapping project to exchange its openly licensed map data.
- Spatial Data Transfer Standard (.sdts): A vector GIS file format used for transferring spatial data between different systems.
- AutoCAD Drawing Files (DWG): An internal, proprietary format used by AutoCAD, a computer-aided design/drafting (CAD) program. It can be converted to a DXF file without loss of graphic information. The lack of a single standard for linking attributes can cause problems when data is transferred between systems.
- GPS Exchange Format (.gpx): An XML-based file format used for storing GPS data. It is commonly used to share and exchange GPS waypoints, tracks, and routes between different GPS devices and software applications. GPX files can store information such as latitude, longitude, elevation, time stamps, and additional attributes associated with GPS data points.
- Geography Markup Language (.gml): An XML-based file format used for describing and exchanging geographic data. It is an open standard maintained by the Open Geospatial Consortium (OGC). GML files can represent various types of geospatial data, including points, lines, polygons, and coverages. GML is widely used in interoperable geospatial systems, allowing data exchange between different GIS software and services. It provides a rich and standardized way to describe geographic features and their attributes in a platform-independent manner.
- Autodesk’s Data Interchange File (DXF) Format: The most widely used vector data transfer format, containing very complete display information. DXF offers several different ways to store attribute information and has no attribute standard, which can cause problems when importing attributes.
- Digital Line Graphs (DLG): A transfer format used by the US Geological Survey (USGS) that depicts vector information portrayed on printed paper maps. Carries very accurate coordinate information and sophisticated feature-classification information but no other attribute data. Does not include any display information.
- Hewlett-Packard Graphic Language (HPGL): A language that controls computer plotters, containing display information but no geographic coordinates or attribute data. Not appropriate for the storage or transfer of GIS data.
- MicroStation Design Files (DGN): The internal format used by Bentley Systems Inc.’s MicroStation, a CAD program. Well documented and standardized, so it may also be used as a transfer standard. Contains detailed display information. The most common way to store attributes is to place them in an external database file and record links in the MSLINK field, a data item carried for each element in the DGN file.
- Topologically Integrated Geographic Encoding and Referencing Files (TIGER): An ASCII transfer format used by the US Census Bureau to store the street maps constructed for the 1990 census. Contains complete geographic coordinates and is line, not polygon, based. The most important attributes include street name and address information. Does not contain display information. Maps of the entire US are available in TIGER format.
- Vector Product Format (VPF): A binary format used by the US Defense Mapping Agency. Well documented and can be used as an internal format and as a transfer format. Carries geographic and attribute information but no display data. VPF files are sometimes referred to as VMAP products. The Digital Chart of the World (DCW) is published in this format.
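Since several formats in this list (GPX, GML, KML, OSM) are XML-based, they can be inspected with a general-purpose XML parser. As a minimal sketch, the snippet below reads waypoints from a small GPX 1.1 document using Python's standard `xml.etree.ElementTree`. The sample document and the `read_waypoints` helper are made up for illustration, and the code assumes the GPX 1.1 namespace.

```python
import xml.etree.ElementTree as ET

# GPX 1.1 elements live in this XML namespace.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# A made-up sample document with a single waypoint.
SAMPLE_GPX = """<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1">
  <wpt lat="52.5200" lon="13.4050">
    <ele>34.0</ele>
    <time>2024-01-01T12:00:00Z</time>
    <name>Sample point</name>
  </wpt>
</gpx>
"""


def read_waypoints(gpx_text):
    """Return a list of dicts, one per <wpt> element in the GPX text."""
    root = ET.fromstring(gpx_text)
    waypoints = []
    for wpt in root.findall("gpx:wpt", GPX_NS):
        ele = wpt.findtext("gpx:ele", default=None, namespaces=GPX_NS)
        waypoints.append({
            # Latitude and longitude are attributes of the waypoint element.
            "lat": float(wpt.get("lat")),
            "lon": float(wpt.get("lon")),
            # Elevation, time, and name are optional child elements.
            "ele": float(ele) if ele is not None else None,
            "time": wpt.findtext("gpx:time", namespaces=GPX_NS),
            "name": wpt.findtext("gpx:name", namespaces=GPX_NS),
        })
    return waypoints


for point in read_waypoints(SAMPLE_GPX):
    print(point["name"], point["lat"], point["lon"], point["ele"])
```

Tracks and routes use different element names than waypoints, so a fuller reader would need to handle those as well.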
Raster GIS file formats
Raster GIS file formats are digital image files that represent data in a grid format, where each cell or pixel in the grid has a value representing a specific attribute such as elevation, temperature, or color.
Raster data is commonly used in GIS for a range of purposes, such as remotely sensed data, shaded relief and topographic data, satellite imagery, and aerial imagery. These types of data give us amazing visual insights into the Earth’s surface, helping us analyze and explore our environment with greater clarity.
Here are some of the most common raster file formats used in GIS:
- GeoTIFF: GeoTIFF is a georeferenced Tagged Image File Format. It combines raster image data with geographic metadata, allowing spatial referencing of the image. GeoTIFF files can store satellite imagery, aerial photographs, and other geospatial raster data, making them widely used in GIS and remote sensing applications.
- JPEG: JPEG (Joint Photographic Experts Group) is a widely used lossy compression format for digital images. It is commonly used for storing photographs and other complex images. While JPEG is primarily a format for visual imagery and not specifically designed for GIS, it is often used for web-based mapping applications that require image display.
- PNG: PNG (Portable Network Graphics) is a raster image file format that supports lossless compression. It is commonly used for storing graphics with sharp edges, text, and transparent backgrounds. PNG files are often used for web graphics, icons, and images that require high-quality display with small file sizes.
- BMP: BMP (Bitmap) is an uncompressed raster image file format. It is widely supported and can store color and grayscale images. BMP files are often used for basic image storage but are less common in GIS applications due to their larger file sizes compared to compressed formats.
- GIF: GIF (Graphics Interchange Format) is a widely used raster image format that uses lossless compression but limits each image to a palette of at most 256 colors. It is commonly used for graphics, animations, and simple images with a limited color palette. GIF files can be transparent and support animation, making them suitable for web graphics and simple visualizations.
- MrSID: MrSID (Multiresolution Seamless Image Database) is a proprietary image format developed by LizardTech. It is designed for efficiently storing and displaying large raster datasets. MrSID files use a wavelet compression technique that enables high-resolution imagery with reduced file sizes, making them useful for geospatial applications.
- ECW: ECW (Enhanced Compression Wavelet) is another proprietary image format known for its high compression capabilities. It allows efficient storage and transmission of large raster datasets while maintaining good image quality. ECW files are commonly used in GIS applications where file size reduction is crucial.
- IMG/HFA: IMG/HFA (Erdas Imagine Hierarchical File Architecture) is a file format used by Erdas Imagine software for storing raster datasets. It supports various types of imagery, such as satellite imagery and aerial photographs. IMG/HFA files can store multispectral and multiband data along with metadata and spatial referencing information.
- NITF: NITF (National Imagery Transmission Format) is a standard image format used by the U.S. Department of Defense and intelligence agencies. It is designed for the exchange and dissemination of geospatial imagery and related metadata. NITF files can store both raster and vector data, along with detailed metadata and geospatial information.
- ASC: ASC (ASCII Grid) is a plain text-based raster format that stores gridded elevation or attribute data. It uses a simple structure where each cell value is represented by a numeric value in the ASCII text file. ASC files are commonly used for storing digital elevation models (DEMs) and other gridded datasets.
- ADRG: ADRG (ARC Digitized Raster Graphics) is a format used by the US military to store raster images of paper maps.
- BIL, BIP, and BSQ: Band Interleaved by Line (BIL), Band Interleaved by Pixel (BIP), and Band Sequential (BSQ) are formats produced by remote-sensing systems. The primary difference among them is the technique used to store brightness values captured simultaneously in each of several colors or spectral bands.
- DEM: DEM (Digital Elevation Model) is a raster format used by the USGS to record elevation information. Unlike other raster file formats, DEM cells do not represent color or brightness values, but rather the elevations of points on the earth’s surface.
- PCX: PCX (PC Paintbrush Exchange) is a common raster format produced by most scanners and personal computer (PC) drawing programs.
- SDTS: SDTS (Spatial Data Transfer Standard) is a general-purpose format designed to transfer geographic information. One SDTS variant is the raster profile, designed as a standard format for transferring raster data. However, this protocol has not yet been finalized.
- TIFF: TIFF (Tagged Image File Format) is a common raster format produced by PC drawing programs and scanners, similar to PCX.
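Of the formats above, the ASCII grid (ASC) is simple enough to read by hand, which makes it a useful way to see what a raster file actually contains. The sketch below parses the common Esri-style layout: a few "keyword value" header lines followed by rows of cell values. The sample values and the `parse_ascii_grid` helper are made up, and the code assumes the usual header keywords (`ncols`, `nrows`, `cellsize`, and an optional `NODATA_value`).

```python
# A made-up 4 x 3 grid in the ASCII grid layout.
ASC_TEXT = """ncols 4
nrows 3
xllcorner 500000.0
yllcorner 4100000.0
cellsize 30.0
NODATA_value -9999
101 102 -9999 104
105 106 107 108
109 110 111 112
"""


def parse_ascii_grid(text):
    """Return (header, grid); grid is a list of rows, NODATA cells are None."""
    lines = text.strip().splitlines()
    header = {}
    data_start = 0
    for i, line in enumerate(lines):
        parts = line.split()
        # Header lines start with a keyword; data rows start with a number.
        if parts and parts[0][0].isalpha():
            header[parts[0].lower()] = float(parts[1])
            data_start = i + 1
        else:
            break

    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    nodata = header.get("nodata_value")

    # Read all values as one stream, then cut it into rows of ncols cells.
    values = [float(v) for line in lines[data_start:] for v in line.split()]
    if len(values) != ncols * nrows:
        raise ValueError("cell count does not match ncols * nrows")

    grid = [
        [None if v == nodata else v for v in values[r * ncols:(r + 1) * ncols]]
        for r in range(nrows)
    ]
    return header, grid


header, grid = parse_ascii_grid(ASC_TEXT)
print(header["cellsize"], grid[0])
```

The header ties the grid to the ground: the lower-left corner coordinates and the cell size are enough to compute the location of every cell. Some files use `xllcenter` and `yllcenter` instead of the corner keywords, which this sketch stores but does not interpret.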
Other GIS file formats
- Relational Database Management System (RDBMS) Enterprise: Enterprise relational databases with spatial support, such as Oracle Spatial and Microsoft SQL Server with Spatial Extensions, are designed to store and manage large-scale geospatial data within a relational database management system. They provide robust data management, querying, and spatial analysis capabilities.
- LiDAR File Formats: LiDAR file formats, such as LAS and LAZ, are used for storing and managing data captured by LiDAR sensors. They store point cloud data, including 3D coordinates and additional attributes, allowing for detailed analysis and visualization of the captured environment.
- CAD File Formats: CAD file formats, such as DWG and DXF, are primarily used in computer-aided design (CAD) software. They store vector-based information related to architectural and engineering designs, including 2D and 3D geometry, annotations, and attributes.
- Elevation File Formats: Elevation file formats, such as GeoTIFF and ASC, store elevation data representing the height or depth of the Earth’s surface. They are commonly used for digital elevation models (DEMs) and terrain analysis.
- Multitemporal File Formats: Multitemporal file formats, such as NetCDF and HDF5, are used to store data that represents changes over time or different temporal snapshots. They are often used in climate modeling, remote sensing, and other applications where temporal analysis is required.
- GIS Software Project File Formats: Project file formats, such as Esri ArcGIS Pro Project (.aprx) and QGIS Project (.qgs), store project-specific information, including data sources, layers, symbology, and analysis settings. They allow users to save and share their GIS projects for future reference or collaboration.
- Cartographic File Formats: Cartographic file formats, such as Adobe Illustrator (AI) and Scalable Vector Graphics (SVG), store map layouts and graphical elements used for cartographic design. They are used to create visually appealing maps with various annotation and design elements.
- 3D File Formats: 3D file formats, such as COLLADA (DAE) and 3D PDF, store three-dimensional geometric data along with attributes. They are commonly used in GIS applications that require the visualization and analysis of 3D data, such as exported 3D terrain models or building models.
- Indoor Mapping File Formats: Indoor mapping file formats, such as IndoorGML and IndoorJSON, are specifically designed to represent indoor spaces, including floor plans, room layouts, and navigation information. They enable the visualization and analysis of indoor environments.
Thank you for taking the time to read our blog post!