A Software Architect’s Exploration of Open-Source GIS Software and SDI Considerations for GIS Applications
Designing Metadata and Schema Structure—setting up the structure of your geodatabase, defining how different datasets will interact, and establishing standards for metadata. This step ensures data organization, consistency, and accessibility throughout the GIS application.
The first step involves designing the metadata and schema structure. This entails setting up the structure of the geodatabase—a database designed to store, query, and manipulate geographic information and spatial data. A geodatabase integrates geographic (spatial) data, attribute data, and the spatial relationships between datasets, providing a more robust framework for managing spatial data than traditional flat-file storage. This step also covers defining how the different datasets will interact and establishing standards for metadata.
Following this, you must establish metadata standards; INSPIRE and ISO 19115 are commonly used for geospatial data.
After deciding on the standards, create metadata templates based on them. Documenting your data collection and entry procedures also forms a significant part of the workflow. The last part of this stage involves implementing data quality controls: establishing procedures to check your data's quality.
The output of this process is a structured geodatabase, comprehensive metadata standards, and data collection and quality control procedures.
Data Collection and Digitization: In this step, you will collect spatial and non-spatial data from various sources. The accuracy and comprehensiveness of your data collection efforts are essential for accurate property assessment and tax calculations. Thorough planning and testing within the workflow are needed to address challenges that may arise when integrating different open-source software components.
Identify Data Sources: Identify the sources from which you will collect data. This can include local government databases, open-source data platforms, public records, surveys, or field data collection.
Collect Spatial Data: Collect the necessary spatial data for property assessment. This may involve acquiring parcel boundaries, future development plans, satellite images, topographic data, land use data, road networks, environmental data, and other relevant spatial information. QGIS is commonly used for handling and processing spatial data. It provides various data collection, analysis, and validation tools.
Collect Non-Spatial Data: Collect the non-spatial data pertinent to property assessment. This can include structural characteristics of properties, rental history, ownership details, sale history, tax records, local market data, zoning regulations, environmental efficiency ratings, disaster history, crime statistics, liens and judgments, and other relevant data. This is a foundational step. If you have physical paper records, you must either input the data manually or use scanning and Optical Character Recognition (OCR) technology to digitize them; tools like Adobe Acrobat or Tesseract can handle OCR. Check the digitized data for any errors or discrepancies. Data already in Excel or other digital formats can be compiled for easier processing in the next steps, and spreadsheets or databases (e.g., Microsoft Excel or PostgreSQL) can be used to manage and clean the non-spatial data.
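A minimal sketch of the post-digitization cleanup described above: trimming OCR whitespace artifacts, dropping duplicate scans, and flagging incomplete records for manual review. The field names and sample rows are illustrative assumptions, not a prescribed schema.

```python
import csv
import io
import re

# Hypothetical sample of OCR-digitized property records (illustrative only).
RAW = """parcel_id,owner,assessed_value
 P-001 ,Jane Doe,120000
P-002,John  Smith,
P-001,Jane Doe,120000
"""

def clean_records(text):
    """Trim whitespace, collapse repeated spaces, drop exact duplicates,
    and flag rows with missing values for manual review."""
    rows, seen, flagged = [], set(), []
    for row in csv.DictReader(io.StringIO(text)):
        row = {k: re.sub(r"\s+", " ", v).strip() for k, v in row.items()}
        key = tuple(row.values())
        if key in seen:
            continue  # duplicate entry, e.g. from double scanning
        seen.add(key)
        if not all(row.values()):
            flagged.append(row)  # incomplete record -> manual check
        rows.append(row)
    return rows, flagged

records, needs_review = clean_records(RAW)
```

The same pattern scales to spreadsheet exports: read, normalize, deduplicate, then route flagged rows back to the data-entry team.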
Define the Data Structure: Abiding by ISO 19107 - Geographic information - Spatial schema, the first step is to define the structure of your geodatabase. To do this, you must fully understand your existing data, which can be achieved through schematic mapping. This mapping should visually represent how your current data is structured and highlight any potential areas of inefficiency or redundancy. This step also involves generating periodic reports, which help monitor the state of your data and provide insight into how it's changing over time, facilitating iterative improvements in your data structure. This stage also includes schematic mapping of the schemes of service of selected institutions. By understanding these services, you can identify gaps in your data or potential new data sources; in doing so, you're not only defining but also refining the structure of your geodatabase. QGIS is the open-source tool for schematic mapping in this step, while the open-source relational database PostgreSQL, with its geospatial extension PostGIS, is the chosen tool for defining the database structure.
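One way to keep the schema definition explicit and reviewable is to describe tables in a small data structure and generate the PostGIS DDL from it. The table and column names below are illustrative assumptions; in practice the statements would be executed against PostgreSQL via psql or a client library.

```python
# Simple schema description -> PostGIS CREATE TABLE statements.
# Table/column names and types are illustrative, not a mandated model.
TABLES = {
    "parcels": {
        "columns": {"parcel_id": "varchar(20) PRIMARY KEY",
                    "land_use": "varchar(50)",
                    "area_sqm": "numeric"},
        "geometry": ("geom", "POLYGON", 4326),  # column, type, SRID
    },
    "roads": {
        "columns": {"road_id": "varchar(20) PRIMARY KEY",
                    "name": "varchar(100)"},
        "geometry": ("geom", "LINESTRING", 4326),
    },
}

def build_ddl(tables):
    """Emit one CREATE TABLE statement per described table."""
    stmts = []
    for name, spec in tables.items():
        cols = ",\n  ".join(f"{c} {t}" for c, t in spec["columns"].items())
        gcol, gtype, srid = spec["geometry"]
        cols += f",\n  {gcol} geometry({gtype}, {srid})"
        stmts.append(f"CREATE TABLE {name} (\n  {cols}\n);")
    return stmts

ddl = build_ddl(TABLES)
```

Keeping the schema in code like this makes the structure easy to diff and iterate on as the periodic reports reveal inefficiencies.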
Establish Metadata Standards: Leveraging ISO 19115 - Geographic Information - Metadata, decide on the metadata standards to use. These standards will guide the formatting and classification of your geospatial data. The choice of standards should also consider the schematic mapping of the schemes of service of relevant institutions. Understanding these various service schemes gives you a clear view of what metadata standards would best suit your needs and ensures effective communication and data interoperability with these institutions. GeoNetwork, a comprehensive catalog system for managing spatially referenced resources, is an open-source tool for managing metadata in accordance with established standards.
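To make the idea concrete, here is a deliberately simplified metadata record inspired by ISO 19115 core elements (title, abstract, geographic extent, contact). The real standard and its XML encoding are far richer, so treat this as an illustrative sketch only; the element names and sample values are assumptions.

```python
import xml.etree.ElementTree as ET

def build_metadata(title, abstract, bbox, contact):
    """Build a minimal, ISO 19115-inspired metadata record as XML."""
    md = ET.Element("metadata")
    ET.SubElement(md, "title").text = title
    ET.SubElement(md, "abstract").text = abstract
    extent = ET.SubElement(md, "extent")
    for k in ("west", "east", "south", "north"):
        ET.SubElement(extent, k).text = str(bbox[k])
    ET.SubElement(md, "contact").text = contact
    return md

record = build_metadata(
    "Municipal parcel boundaries",
    "Digitized cadastral parcels for property assessment.",
    {"west": 9.1, "east": 9.6, "south": 45.3, "north": 45.6},
    "gis@example.org",
)
xml_text = ET.tostring(record, encoding="unicode")
```

Records like this can then be loaded into a catalog such as GeoNetwork, which handles the full standard-compliant encodings.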
Create Metadata Templates: Utilizing ISO 19110 - Geographic information - Methodology for feature cataloging, create metadata templates that adhere to your chosen standards. Part of this step involves developing a conceptual schema for digital information dashboards. This conceptual schema acts as a blueprint, showing how data will be organized and presented on your dashboards. This organization must align with the metadata templates, ensuring that the data is well-structured and can be effectively visualized for end users. Metatools, an extension of QGIS, is the open-source tool recommended for creating and managing metadata templates in this step. For designing digital dashboards based on metadata templates, Apache Superset, an open-source data exploration and visualization platform, is to be used.
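A metadata template is, at its core, a list of required fields plus per-field checks that every dataset's record must satisfy. The sketch below shows one way to express and enforce such a template; the field names and checks are assumptions for illustration, not part of ISO 19110.

```python
# A template as required fields plus validation rules (illustrative).
TEMPLATE = {
    "title": lambda v: bool(v.strip()),
    # Expect an ISO-style YYYY-MM-DD date string.
    "date_collected": lambda v: len(v) == 10 and v[4] == "-" and v[7] == "-",
    "collector": lambda v: bool(v.strip()),
    "use_constraints": lambda v: True,  # free text; presence is enough
}

def validate(record, template=TEMPLATE):
    """Return the list of fields that are missing or fail their check."""
    problems = []
    for field, check in template.items():
        if field not in record or not check(record[field]):
            problems.append(field)
    return problems

ok = validate({"title": "Road network", "date_collected": "2024-05-01",
               "collector": "Survey team A", "use_constraints": "none"})
bad = validate({"title": " ", "date_collected": "May 2024"})
```

Running every new metadata record through such a validator keeps the catalog consistent across datasets and contributors.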
Document Data Collection Procedures: Following ISO 19157 - Geographic information - Data quality, document your data collection and entry procedures. By understanding the schemes of service of relevant institutions, you can tailor your data collection procedures to harmonize with these services. These documented procedures should be collated into a comprehensive handbook describing your approach to digitizing non-digital data. This handbook serves as a go-to resource, ensuring consistency in data handling practices and offering clear guidance to all involved in the data collection process. LibreOffice Writer, an open-source word processor, can be used to create the handbooks and document data collection procedures. The open-source mobile app Input, based on QGIS, is recommended for systematic field data collection per the established procedures.
Implement Data Quality Controls: Implement robust data quality control measures based on ISO 19138 - Geographic Information - Data Quality Measures. This involves defining a quality assurance approach to digitalizing existing non-digital data, ensuring the digitized data retains its accuracy and reliability. Moreover, these measures should extend to checking data resulting from periodic reports, contributing to an ongoing effort to maintain the integrity of the data in your geodatabase. Furthermore, organize hands-on training sessions for your team to understand and execute these quality control procedures effectively. This not only increases the proficiency of your team but also helps in maintaining consistent, high-quality data across your geodatabase. QGIS, with its built-in functionalities for data cleaning and quality control, is the tool of choice in this step. Moodle, a widely used open-source learning management system, is the chosen tool for creating and managing hands-on training modules to ensure the effective implementation of quality control procedures.
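The quality control procedures this step calls for can be encoded as explicit, named rules that are run against the geodatabase on a schedule, producing a report of violations per rule. The rules, thresholds, and field names below are illustrative assumptions.

```python
# Rule-based quality checks of the kind step 1.5 describes (illustrative).
RULES = [
    ("positive area", lambda r: r["area_sqm"] > 0),
    ("known land use", lambda r: r["land_use"] in
        {"residential", "commercial", "agricultural"}),
    ("value plausible", lambda r: 0 < r["assessed_value"] < 10_000_000),
]

def quality_report(records):
    """Map each rule name to the parcel_ids that violate it."""
    report = {name: [] for name, _ in RULES}
    for rec in records:
        for name, rule in RULES:
            if not rule(rec):
                report[name].append(rec["parcel_id"])
    return report

report = quality_report([
    {"parcel_id": "P-001", "area_sqm": 540.0, "land_use": "residential",
     "assessed_value": 120000},
    {"parcel_id": "P-002", "area_sqm": -3.0, "land_use": "unknown",
     "assessed_value": 95000},
])
```

Because each rule has a name, the resulting report doubles as training material: the hands-on sessions can walk the team through exactly which checks exist and why records fail them.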
Do: Take the time to plan and design the geodatabase structure carefully. Consider scalability and flexibility to accommodate future data updates and changes. Invest time and resources in designing a well-structured geodatabase and establishing clear, comprehensive metadata standards. This will save you a lot of trouble down the line.
Don't: Neglect the importance of metadata. Well-documented metadata improves data discoverability, ensures data integrity, and facilitates data sharing. Don't neglect data quality controls. Regular checks for data quality are crucial for maintaining the accuracy and reliability of your geodatabase.
Output: A well-defined data structure, established metadata standards, metadata templates, documented data collection procedures, and implemented data quality controls. These outputs will be the foundation for effectively organizing and managing the GIS data.
Don’t forget:
Step 1.1: Define Data Structure—define the structure of your geodatabase. Determine what tables you need, their attributes, and how they relate. Ensure your structure will support all the data you need for your property assessments.
Step 1.2: Establish Metadata Standards—decide on the metadata standards you will use. INSPIRE and ISO 19115 are commonly used standards for geospatial data. Your metadata should include information about when and how the data was collected, who collected it, what geographic area it covers, and any restrictions on its use.
Step 1.3: Create Metadata Templates—create templates for your metadata based on your chosen standards. This will ensure consistency across all your datasets.
Step 1.4: Document Data Collection Procedures—document your data collection and entry procedures. This will help ensure your data is collected and entered consistently and accurately.
Step 1.5: Implement Data Quality Controls—establish procedures for checking the quality of your data. This could involve regular checks for errors in data entry, consistency across different datasets, and checks to ensure that your data matches reality.
Data Cleaning and Validation: Conduct data cleaning and validation procedures after collecting data. Remove duplicate entries, correct errors, and fill in missing values. Validate the accuracy and integrity of the data by cross-referencing different data sources and conducting quality control checks.
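The cross-referencing described above can be sketched as a merge of a primary source with a secondary one: gaps in the primary record are filled from the secondary source, and fields where the two sources disagree are flagged for manual review. Both datasets here are made-up examples.

```python
def merge_sources(primary, secondary):
    """Fill missing primary-source fields from a secondary source and
    flag (id, field) pairs where the two sources conflict."""
    merged, conflicts = {}, []
    for pid, rec in primary.items():
        merged[pid] = dict(rec)
        for field, value in secondary.get(pid, {}).items():
            if merged[pid].get(field) in (None, ""):
                merged[pid][field] = value      # fill missing value
            elif merged[pid][field] != value:
                conflicts.append((pid, field))  # needs manual review
    return merged, conflicts

primary = {"P-001": {"owner": "Jane Doe", "zoning": None}}
secondary = {"P-001": {"owner": "J. Doe", "zoning": "R2"}}
merged, conflicts = merge_sources(primary, secondary)
```

The conflict list is the important output: it turns "cross-reference your sources" into a concrete review queue rather than a silent overwrite.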
Do: Ensure data accuracy and completeness through thorough data collection efforts. Validate the collected data against reliable sources and employ quality control measures.
Don't: Overlook the importance of data cleaning and validation. Inaccurate or incomplete data can lead to erroneous assessments and calculations.
Don't: Rely solely on a single data source. Cross-reference and validate data from multiple sources to ensure accuracy and completeness. Using diverse and reliable data sources enhances the reliability of your property assessments and tax calculations.
Output: A comprehensive collection of spatial and non-spatial data relevant to property assessment and tax calculations. The collected data will serve as the basis for subsequent analysis and processing steps in the GIS application.
Geodatabase Creation and Data Import—Integration of Spatial and Non-Spatial Datasets within the Geodatabase: In this step, you will import both the spatial and non-spatial data into the geodatabase and establish relationships between different datasets. Data integration allows for efficient management and analysis of the collected data. PostgreSQL and PostGIS are commonly used for data integration. PostgreSQL provides a robust relational database management system, while PostGIS adds spatial capabilities to handle spatial data effectively. A well-structured workflow is important here for maintaining quality control and reproducibility, and for managing updates to the open-source software without disruption.
Import Spatial Data: Use PostgreSQL and PostGIS to import the spatial data into the geodatabase. This includes importing digitized parcel boundaries, satellite images, topographic data, road networks, environmental data, and other relevant spatial datasets. Create appropriate tables and define the necessary attributes to store the spatial data.
Import Non-Spatial Data: Import non-spatial data, such as property characteristics, ownership details, rental history, tax records, and other relevant information. Use PostgreSQL to create tables and define the appropriate fields to store the non-spatial data.
Establish Relationships: Identify the relationships between different datasets within the geodatabase. Establish primary and foreign key relationships to link the spatial and non-spatial data tables. This will enable efficient querying and analysis of the integrated data.
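The integration steps above can be sketched as DDL: a spatial parcels table, a non-spatial attributes table linked to it by a foreign key, and a query that the link makes possible. The statements are built as Python strings here for illustration; in practice you would run them against PostgreSQL/PostGIS via psql or a client library such as psycopg2. All table and column names are assumptions.

```python
# Spatial table (illustrative schema).
PARCELS_DDL = """CREATE TABLE parcels (
  parcel_id varchar(20) PRIMARY KEY,
  geom geometry(POLYGON, 4326)
);"""

# Non-spatial table, linked to parcels via a foreign key.
ATTRIBUTES_DDL = """CREATE TABLE property_attributes (
  parcel_id varchar(20) REFERENCES parcels(parcel_id),
  owner_name varchar(100),
  assessed_value numeric
);"""

def attributes_with_area_query():
    """A query the FK relationship enables: each property's attributes
    joined with the area of its parcel geometry."""
    return ("SELECT a.owner_name, a.assessed_value, ST_Area(p.geom) "
            "FROM property_attributes a "
            "JOIN parcels p ON a.parcel_id = p.parcel_id;")
```

With the primary/foreign key in place, spatial and non-spatial data can be queried together without duplicating either dataset.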
Property Assessment: In this step, you will conduct property assessments based on the available data and methodologies. Property assessment involves determining the market value of properties based on various factors such as size, condition, location, and comparable sales. You can use QGIS, PostgreSQL, and spreadsheets for data analysis and calculations to conduct property assessments. These tools provide functionality for organizing and analyzing property data, performing calculations, and recording assessment results.
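As a toy illustration of the sales-comparison idea mentioned above: average the price per square metre of comparable sales, then adjust by a simple condition factor. Real assessment models are far more involved; the factors and figures below are assumptions, not a recommended methodology.

```python
# Illustrative condition adjustment factors (assumed values).
CONDITION_FACTOR = {"poor": 0.85, "average": 1.0, "good": 1.1}

def assess_value(area_sqm, condition, comparables):
    """Estimate market value from comparable sales.
    comparables: list of (sale_price, area_sqm) for similar parcels."""
    price_per_sqm = sum(p / a for p, a in comparables) / len(comparables)
    return round(area_sqm * price_per_sqm * CONDITION_FACTOR[condition], 2)

value = assess_value(
    area_sqm=100,
    condition="good",
    comparables=[(200_000, 100), (330_000, 150), (90_000, 50)],
)
# 2000, 2200, and 1800 per sqm average to 2000; 100 sqm * 2000 * 1.1.
```

Calculations like this are where QGIS attribute tables, PostgreSQL queries, and spreadsheets come together: the comparables come from the geodatabase, and the formula runs wherever the assessor works.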
Firstly, it's crucial to thoroughly understand the objectives of the Web/Mobile GIS application, the target audience, the type of data you'll be working with, and the specific features the end-users will need. Do they need to be able to interact with the data? If so, how? Should they be able to filter or search the data? Answering these questions upfront will help guide your design and development process. Don't rush this process or make assumptions about what the end users need. Communicate effectively with all stakeholders and consider getting their feedback at multiple stages throughout the project. There are really three building blocks to realizing a well-designed application: Block A—Setting Up the Geospatial Server: A geospatial server such as GeoServer or MapServer is needed to serve your geospatial data over the web, and finally, Block B—Designing the Web GIS Application: This phase involves planning and sketching the layout, functionalities, and overall user experience of the web or mobile application and creating a client-side Interactive Map Interface: This would require software like Leaflet, which is a powerful open-source JavaScript library for mobile-friendly interactive maps.
Installation of GeoServer: GeoServer is a Java-based software server that allows users to view and edit geospatial data. Using open standards set by the Open Geospatial Consortium (OGC), GeoServer allows great flexibility in map creation and data sharing. Its capabilities extend to on-the-fly rendering of geospatial data to images for visualization on a map, making it easier to depict intricate geographic data within your application. Moreover, it's designed for performance and scalability, which makes it well-suited for high-demand scenarios, offering options for caching and clustering. While it's possible to create a web or mobile app without GeoServer, if your app requires interaction with intricate geospatial data, geographic queries, or rendering geospatial data on a map, a geospatial server can simplify and optimize these tasks. Several open-source geospatial servers are available, as discussed above. GeoServer is perhaps the most popular open-source server designed to serve geospatial data. MapServer is another popular open-source platform for publishing spatial data and interactive mapping applications to the web, known for its speed and reliability. Alternatively, there is PostGIS: while technically an extension to the PostgreSQL database, it adds geospatial capabilities that effectively allow the database to serve as a geospatial data server, and it's particularly known for its powerful spatial database capabilities.
To start with the installation, first ensure that you have Java installed on your system; GeoServer requires a Java Runtime Environment (JRE), and it's recommended to use a version that GeoServer officially supports. Download and install GeoServer from the official website. The installation process is relatively straightforward, with installers available for various operating systems. After installation, you can start GeoServer by running the startup script that came with the installation.
This will launch GeoServer, which by default runs on port 8080. By navigating to http://localhost:8080/geoserver, you can access the GeoServer web admin interface to start managing your geospatial data.
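Beyond the web admin interface, GeoServer also exposes a REST API on the same port for scripted administration. The sketch below only constructs a request to create a workspace, without sending it; the credentials shown are GeoServer's well-known defaults (which you should change), and the workspace name is an arbitrary example.

```python
import base64
import urllib.request

def workspace_request(name, user="admin", password="geoserver"):
    """Prepare (but do not send) a POST to GeoServer's REST API that
    would create a new workspace. Assumes the default local install."""
    url = "http://localhost:8080/geoserver/rest/workspaces"
    body = f'{{"workspace": {{"name": "{name}"}}}}'.encode()
    req = urllib.request.Request(url, data=body, method="POST")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/json")
    return req

req = workspace_request("assessment")
# Against a running server you would then call urllib.request.urlopen(req).
```

Scripting such calls is how layer publication can be folded into the reproducible workflow emphasized earlier, instead of clicking through the admin UI.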