New York City, [April 5th, 2024] - Zoe is a quality inspector at a natural gas power plant. She faces a recurring problem: as she inspects the facility, she spots a corroded pipe fitting that needs immediate attention. Unfortunately, the only way to get it fixed is to fill out a paper form with the proper details, a process prone to miscommunication and errors. In Zoe's experience, incorrect fittings are often ordered because forms are misread, and even when the right fitting arrives, locating the specific pipe in the plant's labyrinth becomes another challenge. She wishes there were a more efficient way to communicate these issues. Zoe imagines an ideal solution: a device that follows her around, transcribing and summarizing her observations for the maintenance and purchasing departments. But that’s surely science fiction…

Why now?

The limitations of traditional reporting solutions in industrial environments like Zoe's are evident. Paper forms are slow and error-prone. Mobile devices are hard to use while wearing safety equipment like heavy gloves, and site restrictions may require the purchase of expensive intrinsically safe devices. This is where ambient computing devices come into play, offering seamless, unobtrusive computing services.

Key components of these advanced solutions include:

Ambient Computing Devices: Devices like smart speakers or smart watches integrate into the user's environment, offering on-demand services like voice commands and audio/video recording.
Large Language Models (LLMs): These models can interpret complex natural-language requests accurately.
Computer Vision (CV): CV excels at identifying objects and actions within an image, and when combined with LLMs, it can create comprehensive work records by matching video frames with the corresponding audio.

How it works in practice

A worker like Zoe can speak to an ambient computing device, which records audio and, optionally, video. The system then automatically creates a chapterized video with a bookmark for each distinct section. It can also pre-fill a form with a detailed summary and relevant stills from the video. The LLM can interpret the worker's final intent, even if they correct themselves or change their mind mid-recording.
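
To make the chapter and bookmark idea concrete, here is a minimal Python sketch of how timestamped transcript segments might be grouped into chapters and used to pre-fill a form. It is illustrative only and does not describe Datch's actual implementation: the names Segment, Chapter, build_chapters and prefill_form are hypothetical, and the topic labels are assumed to come from upstream speech-to-text and LLM steps that are not shown.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed utterance. The topic label is assumed to be
    assigned by an upstream LLM step (hypothetical, not shown here)."""
    start_s: float
    end_s: float
    topic: str
    text: str


@dataclass
class Chapter:
    """A run of consecutive segments that share the same topic."""
    topic: str
    start_s: float
    end_s: float
    text: str


def build_chapters(segments):
    """Merge consecutive segments with the same topic into chapters."""
    chapters = []
    for seg in segments:
        if chapters and chapters[-1].topic == seg.topic:
            # Same topic as the previous segment: extend the open chapter.
            last = chapters[-1]
            last.end_s = seg.end_s
            last.text = f"{last.text} {seg.text}"
        else:
            # Topic changed: start a new chapter at this segment.
            chapters.append(Chapter(seg.topic, seg.start_s, seg.end_s, seg.text))
    return chapters


def prefill_form(chapters):
    """Build a form draft: one bookmark per chapter plus the chapter text."""
    return {
        "bookmarks": [{"topic": c.topic, "start_s": c.start_s} for c in chapters],
        "sections": {c.topic: c.text for c in chapters},
    }
```

In this sketch, each bookmark points at the start time of a chapter, so a reviewer could jump from a section of the form to the matching moment in the recording.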

Upon returning to the office, the worker can review and submit the form for processing, saving valuable time while preserving the first-hand context of the on-site situation. The form's recipient receives a rich, multi-modal record of the event. They can quickly grasp the situation from the high-level summary and delve into the raw data for a comprehensive understanding of the worker's observations and actions.

How Datch solves this

In environments like Zoe's, paper workflows are slow and error-prone, and traditional digital solutions pose safety risks or practical challenges. AI technologies integrated with ambient computing devices offer a compelling new approach. These tools not only improve the accuracy and efficiency of information capture and reporting but also support safety and adaptability in complex industrial settings.

Datch’s ambient computing solutions can be adapted to various situations to meet specific company needs.

Want to learn more? Let us know below