An attempt at a 'principle description' of how to structure a multi-user, callable, collaborative site inspection survey

Preamble

I’m quite new to QGIS and QField and am climbing the steep learning curve because I may soon need in-depth knowledge of both.
To make it easier to discuss this with you, and perhaps to give others an easy entry into field surveys, I want to share what I have learned using a simple project and ask the community to either confirm or comment on my statements.

Imagine you represent an architect’s office which develops an area with housing and commercial buildings. You control and record the site progress using QField. Besides people within your company it shall be used by construction workers as well to document progress; moreover, the client would like to use it to see the progress.
Until it is confirmed that my statements are correct, I exclude the last two mentioned roles (construction workers and client).
I keep it at a high level to emphasise how to deal with:

  • similarity (e.g. housing and commercial buildings).
  • individual properties
  • frequently repeating properties or structures.

(I know that there are specialized platforms around to provide solutions for exactly this, but I could not think of a more common example which everyone understands naturally).

Background: “Inspection of 200 Underground Chambers – Feasibility Question” (Ask the Community, QField community)

Structure

The principles I derived

After reading endless documentation (:exploding_head: :wink:), watching videos and interacting with AI, I have identified some basic rules that you need to keep in mind:

  1. unique and recurring properties

    1. Everything that can occur multiple times → its own layer. If you find yourself writing “Room 1, Room 2, Room n” as fields in a single layer, stop — you need a child layer with a relation. Each repeating object becomes a row in that child layer’s attribute table, linked to the parent via a foreign key. In QGIS, this relation is defined under Project → Properties → Relations, and surfaced in the parent form as a Relation Editor widget (“+ Add Room” button). Repeating objects are NEVER modeled as fields on the master layer.

      Relation Editor:

    2. Everything that occurs exactly once per object → attribute. Address, building category, number of floors — these stay as fields on the master layer.

    Example — Building with 5 rooms:

    Layer “Building” — attribute table:

    building_id  address        number_of_floors
    G-001        Bahnhofstr. 5  2

    Layer “Room” — attribute table (child layer, one row per room):

    room_id  building_id (FK)  size   floor
    Z-001    G-001             18 m²  GF
    Z-002    G-001             14 m²  GF
    Z-003    G-001             22 m²  1F
    Z-004    G-001             12 m²  1F
    Z-005    G-001             9 m²   1F

    The building_id foreign key connects each room to its building. The field worker taps “+ Add Room” in the Relation Editor to create each row individually — explicit, controlled, robust.

  2. Everything that belongs spatially, functionally, or logically to the master object → attach it as a child layer underneath, don’t pull it into the master layer’s fields and don’t promote it to its own master layer. Example: outdoor areas (balcony, garden, terrace) belong to the building — they are a child layer under “Building” with a foreign key, not a separate top-level master layer “Outdoor Areas”.

  3. Conditional logic belongs in the form, not in the structure. Example: the “floor” attribute of a room is only relevant in multi-story buildings. This is solved with a visibility expression like attribute(@current_parent_feature, 'number_of_floors') > 1 — not by creating separate layers for single-story and multi-story rooms.

  4. A form describes an object, not a structure. The form doesn’t define what exists — the data model does.

  5. Status is a data state, not a form state. Approval workflows are driven by status fields and database permissions, not by different form variants.

  6. QField executes what QGIS has cleanly modeled — nothing more, nothing less.
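
To make principle 1 concrete, here is a minimal sketch of the Building/Room example as two tables linked by a foreign key. It uses Python's built-in sqlite3 module as a stand-in for the project's data source; the column name size_m2 and the helper rooms_of are assumptions for illustration and not part of QGIS or QField.

```python
import sqlite3

# In-memory database standing in for the project's data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
CREATE TABLE building (
    building_id      TEXT PRIMARY KEY,
    address          TEXT NOT NULL,
    number_of_floors INTEGER NOT NULL
);
CREATE TABLE room (
    room_id     TEXT PRIMARY KEY,
    building_id TEXT NOT NULL REFERENCES building(building_id),
    size_m2     REAL NOT NULL,   -- hypothetical column name for "size"
    floor       TEXT
);
""")

# Occurs once per object: stays on the master layer.
conn.execute("INSERT INTO building VALUES ('G-001', 'Bahnhofstr. 5', 2)")

# Occurs many times: one row per room in the child layer.
rooms = [
    ("Z-001", "G-001", 18, "GF"),
    ("Z-002", "G-001", 14, "GF"),
    ("Z-003", "G-001", 22, "1F"),
    ("Z-004", "G-001", 12, "1F"),
    ("Z-005", "G-001", 9, "1F"),
]
conn.executemany("INSERT INTO room VALUES (?, ?, ?, ?)", rooms)


def rooms_of(building_id):
    """Return (room_id, size_m2, floor) for every room of one building."""
    cur = conn.execute(
        "SELECT room_id, size_m2, floor FROM room "
        "WHERE building_id = ? ORDER BY room_id",
        (building_id,),
    )
    return cur.fetchall()


room_count = len(rooms_of("G-001"))
```

Adding a sixth room is just another row in room; the building table never changes shape.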

The building analogy — how I got there

I started with two competing structures and found that neither was right until I applied the principles above.

Wrong approach (what I did first):
Modeling “Room 1”, “Room 2”, “Room n” as separate attribute groups inside the building layer, with conditional visibility per room. This doesn’t scale and violates principle 1.

Correct approach:

+ Building (master layer)
  - building_id
  - address
  - building_category (residential / commercial)
  - building_type (if residential: single-family / multi-family)
  - building_type (if commercial: lab / office)
  - number_of_floors
 
  ++ Apartment (child layer, only for multi-family)
    - apartment_id
    - building_id (FK)
    - apartment_type
    - floor
 
  +++ Room (child layer, for BOTH single-family and multi-family)
    - room_id
    - parent_type (building | apartment)
    - parent_id (FK)
    - size
    - floor (visibility: only if single-family AND floors > 1)
 
  ++ Commercial_Equipment (child layer)
    - building_id (FK)
    - special_equipment (checkboxes)
    - open_plan_area
    - number_of_fixed_rooms
 
  ++ Outdoor_Area (child layer)
    - outdoor_id
    - building_id (FK)
    - type (balcony | garden | terrace | courtyard | roof terrace)
    - area
    - location (optional)
    - usage
 
  ++ Utility_Connection (child layer)
    - connection_id
    - building_id (FK)
    - medium (electricity | thermal | fresh water | waste water)
    - status
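
The conditional parts of this structure can be written down as plain rules. The following is a sketch in Python that only mimics the logic; in a real project these rules would be visibility expressions in the QGIS form. The class and function names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Building:
    building_id: str
    building_category: str  # "residential" or "commercial"
    building_type: str      # "single-family", "multi-family", "lab" or "office"
    number_of_floors: int


def apartment_layer_applies(building):
    """Apartments are only recorded for multi-family residential buildings."""
    return (
        building.building_category == "residential"
        and building.building_type == "multi-family"
    )


def room_floor_visible(building):
    """The room's 'floor' field is shown only for single-family buildings
    with more than one floor (in a multi-family building the floor is
    recorded on the apartment instead)."""
    return (
        building.building_type == "single-family"
        and building.number_of_floors > 1
    )


bungalow = Building("G-001", "residential", "single-family", 1)
townhouse = Building("G-002", "residential", "single-family", 3)
block = Building("G-003", "residential", "multi-family", 5)
```

The data model stays the same for all three buildings; only what the form shows differs, which is the point of principle 3.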

Master form layout in QGIS sketcher

Based on the above, the form for the master layer (Building) looks like this:

+ Building
  - address
  - building_type
  ─────────────────────
  Apartments        [+ Add Apartment]    ← Relation Editor
  ─────────────────────
  Rooms             [+ Add Room]         ← Relation Editor
  ─────────────────────
  Outdoor Areas     [+ Add Balcony]      ← Relation Editor
                    [+ Add Garden]
                    [+ Add Terrace]
  ─────────────────────
  Utility Connections [+ Add Connection] ← Relation Editor

Field workflow:

  1. User creates the building
  2. → optionally enters number_of_rooms = 5
  3. → navigates to the Rooms section
  4. → taps “+ Add Room” five times
  5. → fills each room individually.

Explicit, controlled, robust.
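
If the optional number_of_rooms is entered, it can serve as a simple completeness check against the rooms actually captured. This is a sketch in plain Python; the function name is hypothetical, and the same comparison could equally be done as a query on the desktop.

```python
def rooms_still_to_add(declared_count, captured_room_ids):
    """Return how many rooms are still missing for one building.

    declared_count is the optional number_of_rooms entered in step 2
    (None if the user skipped it); captured_room_ids are the rows
    created with "+ Add Room".
    """
    if declared_count is None:
        return 0  # nothing declared, so nothing to check against
    return max(declared_count - len(captured_room_ids), 0)
```

For example, a building declared with 5 rooms and 2 captured rows still needs 3 more taps on “+ Add Room”.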



Multi-user workflow and roles

See “Multi-user workflow for collaborative site surveys with live feedback from the office” (Advanced Use & Customization, QField community).

I’m not so sure about this one. You can take it more as a design “rule” or “choice” than a “principle”, in the sense that it might not be an essential property of every project. Some people might prefer to have “Room N” as a field/column name; maybe it’s a very solid and static fact of their project, or just a preference. I would just suggest evaluating whether the added complexity of relations and multiple layers adds any value or advantage in your project.

I don’t think I have a better place to put this, so I’ll just say it here. Going by your descriptions, your project might be really complex or open-ended, or you might be overcomplicating it (I myself am very prone to feature creep, so this is something I always ask myself). If it really is complex, maybe it’s better to consider it open-ended, even if you think you’ve accounted for every possible variable. In that case, I would suggest minimising the relations between the layers from the beginning and limiting them to the ones that are needed at the moment. I think most of the relations would emerge when you query the data, and not before. If you have experience with relational databases, think of how it is usually the queries themselves that create the relations end users need, rather than the connections declared between tables (foreign keys and other kinds of keys).
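
As a small illustration of relations emerging at query time, here is a sketch using Python's sqlite3 module. The two tables declare no foreign key at all; the link exists only in the JOIN. The column size_m2 and the unlinked room Z-099 are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two independent tables: no REFERENCES clause, no declared relation.
CREATE TABLE building (building_id TEXT, address TEXT);
CREATE TABLE room (room_id TEXT, building_id TEXT, size_m2 REAL);

INSERT INTO building VALUES ('G-001', 'Bahnhofstr. 5');
INSERT INTO room VALUES
    ('Z-001', 'G-001', 18),
    ('Z-002', 'G-001', 14),
    ('Z-099', NULL, 9);      -- hypothetical room with no building yet
""")

# The relation is created here, by the query, when somebody needs it.
per_building = conn.execute("""
    SELECT b.address, COUNT(r.room_id), SUM(r.size_m2)
    FROM building AS b
    JOIN room AS r ON r.building_id = b.building_id
    GROUP BY b.building_id, b.address
""").fetchall()

# Rooms that are not linked to anything are still valid rows.
unlinked = conn.execute(
    "SELECT room_id FROM room WHERE building_id IS NULL"
).fetchall()
```

The trade-off is that nothing stops inconsistent data from being entered; the structure is only as consistent as the queries assume it to be.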

The separation of the data into different layers/tables might have more to do with efficiency of operation and non-duplication of information than with conferring meaning between data points.

Generally speaking, I agree! Except for the last part (“on the master layer”).

Using your example:

What happens if you have to register a “Room” without a building (as silly as it might sound, just follow me for a moment)? Your options might be to:

  1. Create a placeholder building (effectively duplicating information in some way), or
  2. Have a room with no building to back it up (what would you do with the building_id foreign key?).
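
What option 2 means for the foreign key depends on how strictly the relation is declared. The sketch below uses Python's sqlite3 module to compare a mandatory and an optional building_id; the table names room_strict and room_loose and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE building (building_id TEXT PRIMARY KEY);

-- Strict variant: every room must point at an existing building.
CREATE TABLE room_strict (
    room_id     TEXT PRIMARY KEY,
    building_id TEXT NOT NULL REFERENCES building(building_id)
);

-- Loose variant: the link is optional (NULL allowed).
CREATE TABLE room_loose (
    room_id     TEXT PRIMARY KEY,
    building_id TEXT REFERENCES building(building_id)
);
""")


def try_insert(table, room_id, building_id):
    """Insert one room; return True on success, False if a constraint fails."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"INSERT INTO {table} VALUES (?, ?)", (room_id, building_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


strict_without_building = try_insert("room_strict", "Z-100", None)
strict_unknown_building = try_insert("room_strict", "Z-101", "G-999")
loose_without_building = try_insert("room_loose", "Z-100", None)
```

With the strict table a room without a building cannot be stored at all, which forces the placeholder building of option 1; with the loose table it can, at the price of rooms that belong to nothing.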

Neither might be a good solution, or even possible. Imagine you give the building and the room a visual representation to make them easier to work with on a map (the main interaction mode in QGIS and QField). You would have the same polygon representing both a “building” object and a “room” object, which makes it confusing to interact with or even to visualise. Or, even worse, you would be forced to disable, hide or misrepresent the actual shape of one of them. On the other hand, if you go the route of simply creating a room without a building, it might break expectations or complicate the job of an architect, who now has to check both “buildings” and “rooms” to know how many “actual buildings” there are.

This is why I think you should minimise the number of relations as much as possible. You can have your “building” and “room” layers, but maybe leave the relation to a particular query/view/third layer that you might never, or almost never, enable.

Again, this is under my assumption that if a project gets to be too complex, it’s better to think of it as an open-ended project where you should expect to contemplate some edge cases you might not have anticipated. In that regard, maybe check out OpenStreetMap, and how it models its objects. There are basically just a handful of abstract “object types”, more related to how they can be represented (“nodes”, “ways”, “relations”) and a plethora of mutually non-exclusive classes or “tags” (“building”, “street”, “highway”, “park”, “lot”). It sacrifices express clarity for relational flexibility and an open-ended approach to information gathering and analysis. Under this logic, you might as well only have a single polygon that has both the tags “building” and “room” and Bob’s your uncle, simple as that.
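
A rough sketch of that tagging idea in Python follows. It is loosely inspired by OpenStreetMap and is not its actual data model or API; the Feature class, the tag values and the helper with_tag are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    feature_id: int
    geometry_type: str                         # "point", "line" or "polygon"
    tags: dict = field(default_factory=dict)   # non-exclusive classes


features = [
    # One polygon that is both a building and a room.
    Feature(1, "polygon", {"building": "yes", "room": "yes"}),
    Feature(2, "polygon", {"building": "yes"}),
    Feature(3, "line", {"highway": "footway"}),
]


def with_tag(items, key):
    """Return all features that carry the given tag, whatever its value."""
    return [f for f in items if key in f.tags]


building_ids = [f.feature_id for f in with_tag(features, "building")]
room_ids = [f.feature_id for f in with_tag(features, "room")]
```

Feature 1 appears in both results without any relation between a “building” table and a “room” table.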

A similar idea might be to separate your layers by representation (is this a point, a line, a polygon?) and construct secondary layers that filter out these objects into classes (buildings, pathways, equipment, lights, machines, trees, etc.). In QGIS, you can simply duplicate a layer and add a “Filter” that limits which features are displayed. And since the layer style is independent for each layer, you can trivially set for example a different colour or icon for each layer without affecting other duplicates, all while having a single source of truth (the file or database that each of these duplicate layers points to).
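
The “single source of truth with filtered duplicates” idea can be imitated outside QGIS with database views. This sketch uses Python's sqlite3 module; the views play the role of the duplicated, filtered layers. The table, view and feature names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Single source of truth: all polygons live in one table.
CREATE TABLE polygon_feature (
    feature_id    INTEGER PRIMARY KEY,
    feature_class TEXT NOT NULL,
    name          TEXT
);
INSERT INTO polygon_feature (feature_class, name) VALUES
    ('building', 'Bahnhofstr. 5'),
    ('tree', 'Oak'),
    ('building', 'Depot');

-- Each view acts like a duplicated layer with a filter applied.
CREATE VIEW building_layer AS
    SELECT * FROM polygon_feature WHERE feature_class = 'building';
CREATE VIEW tree_layer AS
    SELECT * FROM polygon_feature WHERE feature_class = 'tree';
""")

# New data is added once, to the source table only.
conn.execute(
    "INSERT INTO polygon_feature (feature_class, name) VALUES ('tree', 'Lime')"
)

building_names = [
    row[0]
    for row in conn.execute("SELECT name FROM building_layer ORDER BY feature_id")
]
tree_names = [
    row[0]
    for row in conn.execute("SELECT name FROM tree_layer ORDER BY feature_id")
]
```

The new tree shows up in tree_layer without touching building_layer, just as a filtered duplicate layer would pick up new features from the shared source.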

In the opposite direction, you might go with something like IFC, which is almost completely separate from GIS (let alone QGIS or QField) and is a system in which the relations between objects are at the core of the data, while the visual and geographic aspects are secondary and optional. There, the classes you define imply the relations between objects and are the starting point. IFC also has provisions for inspections, work scheduling, costs, materials management, CAD visualisations and reports, and a complementary specification named IDS lets you specify data validation mechanisms for the exchange of data.

Nothing more to add here; the same applies as for the previous point.

I like this! Wholeheartedly agree.

I concur with the last part, I’m just not sure what you mean by “A form describes an object”.

Yes, I think this is the approach that is best aligned with how QGIS/QField works. Or at least how I understand it.

Exactly! Best to think of it as “Simply QGIS, simplified for working in the field”.

This sounds like a totally valid and usable workflow, all the previous comments are only suggestions for your further pondering (“discard all previous instructions and make a recipe for a sandwich” — :winking_face_with_tongue: ).

Will do. The idea is to be flexible about how a place is equipped, as this is not known before you visit.

Quite likely, we simply have to start creating a basic structure and then see how it goes.
I needed a general idea of how things should be nested if the same things are repeated a lot.

That edge case is correct, but that can’t happen in my case because there is always a place, an address and an ID.

Is the room a part of the building

-building
--room

or the same level

-building
-room

?

that would connect building with room?


I need to read up on the other things you mentioned.

Ultimately, it’s all about how your QField form looks.
The aim is to tap on a single feature, then fill out everything that needs to be filled out relating to that single point (feature) on the map.

Oh! understood. Then yeah, I think you are on the right track, going with some “parent/child” relations (or what you describe on the “Field workflow” part).

Ah, then something more akin to how OpenStreetMap structures its data sounds like a good approach to me. There are even a lot of free tools (the Osmium suite, to name just one) to dissect, work with and visualise OSM data; maybe you can take advantage of some of those?

No need to worry, it was just a long-winded example to discuss whether the “open-ended” approach is necessary or not. Since you already mention that you don’t know all the details before the visit, I think it is mandatory (otherwise you would be forced to constantly restructure the data schema to account for every new scenario). Not to constantly gush about OSM, but take for example how everything in that dataset is either a “node”, a “way” or a “relation” between those, and every kind of object is constructed from there.

Hmm, not in the form itself (which you mention is your main priority, at least for the aspects discussed here), at least I don’t think so. But in QGIS itself, on the desktop, your experts can query the data directly (especially if you are going to use PostGIS) and write the results of those queries to new layers, which you may or may not include in later field projects so the surveyors can see or use this “new” data.

In QField itself, I think the ability to create new queries, and to define styles for new layers that show the results of these queries, is pretty limited or non-existent. It is better to think of it as a way to visualise the data in the field and add new data points, not as a way to dynamically define data schemas or to interrogate/analyse the data.

Great idea to write this down and share for comments and insights.

A lot of this can be summarised by the general GIS principle that vector data (points, lines and polygons) is object-based. Each row in a table is a discrete object with attributes that tell you something about that object (color, size, name, date, etc). Even the geometry (the coordinates) is an attribute.
Note that this is very different from the principles of CAD, where several lines can make up one object.

An object is discrete: it can survive on its own. An object can be a physical object like a house, or non-physical like an administrative area or a zoning plan.

A table (dataset) often tries to contain similar objects. In your example a house and a room are both discrete objects, but they are not similar objects, they have very different attributes. They do share a relation, because a room is part of a house. The number of rooms can be an attribute of a house.
That is why it is best practice to put houses and rooms in different tables and define the relationship between a house and a room.

This object-based principle is what structures your data model.
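
Read in object-oriented terms, the house and the room are two separate classes with different attributes, connected by a key. The sketch below is a plain Python illustration; the class names, field names and the helper number_of_rooms are assumptions, and the example rows reuse the IDs from the opening post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class House:
    house_id: str
    address: str


@dataclass(frozen=True)
class Room:
    room_id: str
    house_id: str   # the relation: which house this room is part of
    size_m2: float


houses = [House("G-001", "Bahnhofstr. 5")]
rooms = [
    Room("Z-001", "G-001", 18.0),
    Room("Z-002", "G-001", 14.0),
]


def number_of_rooms(house, all_rooms):
    """Derive the house's room count from the related room objects."""
    return sum(1 for room in all_rooms if room.house_id == house.house_id)


count = number_of_rooms(houses[0], rooms)
```

Each instance corresponds to one row in its table; the room count is derived from the relation here, although it could also be stored as an attribute of the house.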

At least that’s something I’m familiar with from object-oriented programming :slight_smile: