Clash Detection: It's Big Data

At BuildingSP, we've been working hard on two fronts: making better clash detection tools and creating tools that avoid clashes altogether. We're driven by the belief that clash detection takes too long and is a fundamentally broken workflow. We've come to the following conclusion:

Clash detection is a big data problem.

The purpose of this post is to discuss what we mean by this and describe what we're doing about it. Wikipedia defines big data as this:

Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them.

A deeper definition says that big data is characterized by five Vs:

  • Volume: There's a large volume of data.
  • Variety: The data has variety to it.
  • Velocity: There's a constant stream of data.
  • Variability: The data has some inconsistency.
  • Veracity: The quality of the data can vary.

Clash detection checks all the boxes for the five Vs. The volume of clashes on a project in a given week exceeds our ability to manage and characterize it. The clash information has variety because it comes from multiple sources, teams, and systems. The data has high velocity because it changes every week. The files show variability because they can be inconsistent with one another, and veracity is a concern because the quality of the data depends on who modeled it and when. In short, clash detection involves a huge amount of data that is a constantly moving target.

Not all clashes are important. We need to focus only on the clashes that affect costs, design, and schedule.

Big data problems require big data approaches. Here are three ways that our work with ClashMEP improves clash detection through big data approaches:

  1. Revit-Centric

    ClashMEP improves Revit. Without ClashMEP, a user needs to export their files to another platform like Navisworks, which takes time. By bringing clash detection to Revit, the process of detecting clashes occurs in real time and is interactive: if you model a clash, it shows you, and if you clear the clash, it goes away. Bringing the focus back to Revit simply improves the speed of clash detection.

    More importantly, ClashMEP uses Revit to leverage several big data concepts: connection, context, and community. Revit provides the connection through linked files, connecting groups of data. Focusing on the user experience and where the modeling is occurring creates context, which naturally narrows and filters the total number of clashes and the amount of data. The built-in sharing in Revit, whether through central models, Revit Server, or Collaboration for Revit, creates a collaborative community. ClashMEP leverages Revit to deal with our big data problem.

  2. Parallel, Not Serial Processes

    James Benham of JBKnowledge first identified ClashMEP as a parallel process that is superior to our current serial process. (We were calling this improvement an example of continuous versus batch processing, which is accurate, but James said it better. See for yourself on this YouTube stream of ConTechTrio's podcast.) The difference is who performs the clash detection. Nearly all projects rely on a BIM coordinator to proctor the clash detection process. The project BIM coordinator aggregates files, runs clash tests, and manages meetings for clash resolution. ClashMEP changes this process by giving the whole team the ability to identify clashes continuously. Clashes are identified by everyone rather than by one individual. When the work can be divided this way, a parallel process is generally faster than a serial one. (This doesn't mean BIM coordinators are going away; their job description will just change. We'll talk about that in another post.)

  3. Machine Learning

    In our current clash detection workflows, we use a process called "sampling." Creating clash tests in Navisworks is sampling because we're selectively asking for batches of clashes. "Big data doesn't sample; it just observes and tracks what happens." ClashMEP allows us to observe and track all the clashes that occur, as they happen. How then do we deal with the immense amount of data we get about clashes?

    The answer is we use the big data technique of machine learning. ClashMEP is integrated into the modeling software, which gives us access to an immense amount of metadata – who modeled the clash, what time they modeled it, what files they were using when they were modeling, and all the system information about the clash. In addition, we've created analysis algorithms with the help of the Mechanical Contracting Education and Research Foundation (MCERF), which is part of the Mechanical Contractors Association of America (MCAA). The combination of these analysis algorithms and metadata allows us to programmatically prioritize clashes using machine learning. This means that machine learning will tell us which clashes are the hardest to resolve with the biggest project impact, so project teams can appropriately focus their efforts. We'll be rolling out these machine learning techniques as we transition ClashMEP to a project-wide solution in the next couple of months.
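As a rough illustration of what programmatic prioritization could look like, the Python sketch below ranks clash records by a simple score. It is not ClashMEP's algorithm, which this post does not describe. The record fields, the per-system weights, and the sample data are all hypothetical, and the hand-picked weights merely stand in for values a trained model would learn from project data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ClashRecord:
    """Hypothetical clash record; the field names are illustrative only."""
    clash_id: str
    modeled_by: str
    modeled_at: datetime
    source_file: str
    system: str            # e.g. "duct", "pipe", "conduit"
    overlap_volume: float  # assumed measure of how far the elements intersect


# Assumed, hand-picked weights. A real machine learning approach would
# learn these from past projects instead of hard-coding them.
SYSTEM_WEIGHT = {"duct": 3.0, "pipe": 2.0, "conduit": 1.0}


def priority_score(clash: ClashRecord) -> float:
    """Score a clash: bigger overlaps on heavier-weighted systems rank higher."""
    return SYSTEM_WEIGHT.get(clash.system, 1.0) * clash.overlap_volume


def prioritize(clashes):
    """Return the clashes sorted from highest to lowest priority score."""
    return sorted(clashes, key=priority_score, reverse=True)


# Made-up sample records, used only to exercise the functions above.
clashes = [
    ClashRecord("C1", "user_a", datetime(2017, 1, 9, 10, 0), "mech.rvt", "duct", 0.5),
    ClashRecord("C2", "user_b", datetime(2017, 1, 9, 10, 5), "elec.rvt", "conduit", 2.0),
    ClashRecord("C3", "user_a", datetime(2017, 1, 9, 10, 7), "plumb.rvt", "pipe", 0.4),
]

ranked = [c.clash_id for c in prioritize(clashes)]
print(ranked)  # ['C2', 'C1', 'C3']
```

The metadata fields that the score ignores here (who modeled the clash, when, and in which file) are the kind of inputs a learned model could use in place of the fixed weights.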


What does all this mean? If you're working in clash detection and MEP coordination, you're working in a big data environment. If you're in a big data environment, you need efficient big data tools. ClashMEP is Revit-centric, a superior workflow, and will soon be supercharged with machine learning. We encourage you to become a customer and work with us to change our industry through innovation. Start today by downloading a trial of ClashMEP from our website.

Latest ClashMEP Demonstration Videos

Clash Filtering

File-vs-File Clash Testing

Webinar with A2K Technologies
