Enhancing Equipment Rigging Training through Augmented Reality/Computer Vision (AR/CV) Guidance

Copyright 2024 Terasynth, Inc. All rights reserved. This document is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0): http://creativecommons.org/licenses/by-nc-nd/4.0. For licensing information contact our general mailbox at https://linkedin.com/company/terasynth.

Company: Terasynth 

Solicitation Number: A244-067 

Principal Investigator: Ali Mahvan 

Business Official: Ali Mahvan, CEO

Submission Date: October 1, 2024


Volume 2A: Feasibility Documentation

Technical Approach

Our core technical approach leverages the synergistic capabilities of Augmented Reality (AR) and Computer Vision (CV) to deliver an interactive, real-time guidance system for Soldiers engaged in equipment rigging tasks. The AR component will dynamically overlay digital instructions, 3D models, and visual cues directly onto the Soldier's field of view, providing an intuitive and contextually relevant instructional layer. The CV component will employ state-of-the-art algorithms for object recognition, tracking, and pose estimation to monitor the Soldier's actions in real-time, ensuring procedural adherence and enabling the system to provide immediate, actionable feedback.

This approach is firmly rooted in established research and successful implementations of AR/CV technologies across various domains. Studies have consistently demonstrated the efficacy of AR in enhancing training outcomes by improving engagement, knowledge retention, and task performance. Furthermore, advancements in CV, particularly in object recognition, tracking, and pose estimation, have enabled real-time analysis of complex tasks, even in dynamic and unpredictable environments. RedShred's extensive experience in developing the COACH system for DARPA, which performs real-time work tracking and procedural task guidance using computer vision, directly informs and validates the technical feasibility of this approach.

One of the primary challenges in applying CV to equipment rigging lies in the deformable nature of cloth materials, such as parachutes and straps. Traditional CV algorithms often struggle to accurately track and assess tasks involving objects that undergo significant shape changes. To address this, our approach will employ a multi-pronged strategy:

  • Key-point Tracking: We will implement robust key-point tracking algorithms to identify and track salient features on the cloth materials. This will enable the system to maintain accurate object recognition and pose estimation even as the materials deform during the rigging process.

  • Task Completion States: The system will be designed to prioritize the recognition of easily identifiable task completion states, such as the proper configuration of straps or the secure attachment of equipment. This will reduce reliance on continuous tracking of deformable objects and enable more robust task assessment.

  • Machine Learning: We will leverage advanced machine learning techniques, particularly deep learning, to train our CV models on a large and diverse dataset of rigging scenarios. This dataset will encompass various lighting conditions, equipment configurations, and deformation patterns, ensuring that the system can generalize and perform accurately in real-world conditions. RedShred's expertise in integrating and adapting to new machine learning models, as demonstrated in the COACH system, will be instrumental in continuously refining and improving the CV capabilities of the proposed solution.

By combining these strategies, we are confident that our AR/CV system will overcome the challenges associated with deformable materials and provide Soldiers with a reliable, real-time guidance and feedback system during equipment rigging tasks, thereby significantly enhancing training effectiveness and operational efficiency.
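As an illustrative sketch of how the first two strategies above can interact, consider tracking labeled key points frame to frame by nearest-neighbor association and declaring a task completion state when the tracked points satisfy a geometric predicate. This is a minimal stand-in for the actual learned trackers; all point names, distances, and tolerances here are hypothetical.

```python
import math

def match_keypoints(prev_pts, curr_pts, max_dist=25.0):
    """Greedy nearest-neighbor association of key points between frames.

    prev_pts / curr_pts: dicts mapping point id -> (x, y) pixel coordinates.
    Returns a dict of id -> (x, y) for points re-found in the current frame.
    A hypothetical stand-in for a learned tracker or optical-flow front end.
    """
    matched = {}
    unused = dict(curr_pts)
    for pid, (px, py) in prev_pts.items():
        best, best_d = None, max_dist
        for cid, (cx, cy) in unused.items():
            d = math.hypot(cx - px, cy - py)
            if d < best_d:
                best, best_d = cid, d
        if best is not None:
            matched[pid] = unused.pop(best)
    return matched

def strap_secured(pts, buckle_id="buckle", end_id="strap_end", tol=10.0):
    """Completion-state predicate: strap end is within `tol` px of the buckle."""
    if buckle_id not in pts or end_id not in pts:
        return False
    (bx, by), (ex, ey) = pts[buckle_id], pts[end_id]
    return math.hypot(bx - ex, by - ey) <= tol

# Two synthetic frames: the strap end moves toward the buckle between frames.
frame1 = {"buckle": (100.0, 100.0), "strap_end": (120.0, 100.0)}
frame2 = {"a": (100.0, 101.0), "b": (106.0, 100.0)}  # new detections, ids unknown
tracked = match_keypoints(frame1, frame2)
print(strap_secured(tracked))  # True
```

In practice the predicate-based completion check is what makes the approach robust to deformation: the system need not track every fold of the fabric, only the small set of points that define the completion state.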

Results & Achievements

Prior work by RedShred and Terasynth has yielded tangible results that directly demonstrate Phase I-equivalent effort and the feasibility of the proposed AR/CV-based rigging guidance system.

  • RedShred's COACH System: In the development of the COACH system for DARPA, RedShred achieved real-time work tracking and procedural task guidance with high accuracy, successfully converting technical orders into actionable steps for maintainers. The system's computer vision algorithms demonstrated robust object recognition and tracking capabilities, even in challenging environments with varying lighting conditions and occlusions. 

User studies conducted during the COACH project showcased significant improvements in task performance and knowledge retention compared to traditional training methods, validating the effectiveness of AR/CV guidance in complex, real-world scenarios. The ability to automatically generate manuals and records/logs from video, as demonstrated in COACH, can be extended to capture and analyze Soldier performance during rigging tasks, providing valuable insights for training improvement.

RedShred is a documents-as-a-database platform for XR data readiness and digital transformation. RedShred combines the power of computer vision, natural language processing and natural language understanding in an API-first development environment to structure unstructured data. As an API-first, SaaS platform, RedShred empowers developers and data scientists to tailor the way they interact with document-hosted knowledge to build smarter applications. 

RedShred goes beyond extracting text and images by seeing pages the way a human does and finding the right content through AI document understanding. RedShred understands technical data sourced from flowcharts, fault isolations, diagrams, warnings, tables, and infographics such as line and bar charts. Once ingested, the extracted information is enriched using machine learning and user-customizable plugins to provide a central content hub for building smart applications.

The figures below illustrate how RedShred sees various elements of technical data and colors what it recognizes. RedShred identifies segments: the colored regions of various text and non-text instructions. The content inside these segments is enriched using user-customizable machine learning models and plugins to extract structured metadata such as text, entities of interest (e.g., parts, tools), or objects identified in pictures. This detailed segment-level information is displayed to the content SME so they can rapidly search and find relevant content; for data scientists and developers, it is returned as structured data from API calls for use in their specific applications.

  • Terasynth's AR UX/UI & AI/ML LMS Generation Prototypes: Terasynth has consistently demonstrated its proficiency in crafting intuitive and effective AR user interfaces (UIs) and user experiences (UXs) that seamlessly blend digital information with the real world. These interfaces prioritize clarity, minimize cognitive load, and maximize situational awareness, crucial factors in high-stress, mission-critical environments like equipment rigging. Terasynth's AR UIs are designed to present information in a clear and concise manner, utilizing visual cues, 3D models, and real-time feedback to guide users through complex tasks with minimal distraction. This focus on user-centric design ensures that Soldiers can interact with the AR system effortlessly, allowing them to focus on the task at hand.

Furthermore, Terasynth has developed advanced AI/ML-powered Learning Management Systems (LMSs) that can generate personalized training content and adapt to individual learning styles. These LMSs leverage machine learning algorithms to analyze user performance data, identify knowledge gaps, and provide targeted feedback and remediation. This capability is essential for ensuring that Soldiers receive the training they need to master equipment rigging procedures, regardless of their prior experience or skill level. By incorporating AI/ML-driven LMS generation into the AR/CV rigging guidance system, Terasynth can create a comprehensive training solution that not only guides Soldiers through tasks but also facilitates continuous learning and skill development. Their work on projects that involve bringing "virtual TLOs" into VR experiences directly translates to the creation of an AR-based rigging guidance system.

  • Combined Achievements: The combined expertise of RedShred and Terasynth has resulted in the successful integration of document understanding, real-time task guidance, and AR/CV technologies. This integration has led to the development of prototype systems that can accurately interpret complex technical documentation, provide step-by-step visual guidance, and assess task completion in real-time. These achievements directly align with the Army's objectives for this project, demonstrating the feasibility of developing an AR/CV-based rigging guidance system that can significantly enhance Soldier training and operational effectiveness. RedShred's ability to extract detailed task graphs from technical orders, coupled with Terasynth's proficiency in creating interactive XR experiences, will enable the development of a system that can guide Soldiers through each step of the rigging process with precision and clarity.

These results substantiate the Phase I-equivalent effort, feasibility, and technical maturity of the proposed solution and its potential to meet the Army's stringent requirements for equipment rigging training.


Volume 2B: Technical Proposal

Technical Approach

The development of a functional AR/CV-guided rigging prototype will necessitate a multifaceted technical approach, encompassing hardware selection, software architecture design, and comprehensive data collection and model training strategies. We will develop a functional prototype that integrates AR/CV technology into a Soldier-worn device, likely a head-mounted display. This device will be equipped with sensors and processing units capable of real-time object tracking, recognition, and assessment.

The AR/CV software will provide step-by-step visual instructions overlaid onto the Soldier's view of the equipment. It will track the Soldier's actions, compare them to the expected procedure, and provide immediate feedback on correctness. The user interface will be intuitive and user-friendly, minimizing cognitive load and maximizing situational awareness.

A remote assistance capability will allow Army Parachute Riggers to connect with Soldiers in the field, providing expert guidance and support when needed. This feature will leverage the AR/CV system to share the Soldier's perspective with the Rigger, enabling effective collaboration and troubleshooting.

Hardware

The envisioned Soldier-worn device will be a head-mounted display (HMD) integrated with an array of sensors to facilitate real-time object tracking, recognition, and environmental awareness. The HMD form factor offers an optimal blend of immersion, hands-free operation, and situational awareness, crucial for soldiers operating in dynamic field environments. The sensor suite will include high-resolution RGB cameras for capturing the visual scene, depth sensors for generating 3D spatial information, and inertial measurement units (IMUs) for tracking head movements and orientation. The processing unit will be a compact, high-performance system-on-chip capable of real-time CV computations, ensuring minimal latency and a smooth AR experience. Hardware selection will prioritize modularity and open architecture to avoid vendor lock-in and facilitate future upgrades and integration with other systems.

Software

The AR/CV software will serve as the core of the rigging guidance system, providing an intuitive interface and real-time instructional overlays. The software architecture will be designed to be modular and scalable, enabling efficient integration of new features and capabilities as the project progresses. Key functionalities will include:

  • Object Recognition and Tracking: Robust CV algorithms will be employed to accurately identify and track relevant objects in the rigging environment, including equipment components, straps, and parachute systems. RedShred's expertise in extracting entities of interest from complex documents will be leveraged to enhance object recognition accuracy.

  • Procedural Guidance: The software will parse and interpret rigging procedures from digital manuals and technical orders, generating step-by-step visual instructions and 3D model overlays that guide Soldiers through each task. RedShred's experience in creating detailed task graphs from technical data will be instrumental in developing this functionality.

  • Real-time Feedback: The system will continuously monitor the Soldier's actions, comparing them to the expected procedure and providing immediate visual, auditory, or haptic feedback to ensure adherence and correct errors.

  • Remote Assistance: A secure communication channel will be established to enable Army Parachute Riggers to provide remote support and guidance. The Rigger will be able to view the Soldier's AR scene, offer verbal instructions, and even annotate the scene with virtual markers.

  • User Interface: The UI will be designed with a focus on simplicity and clarity, minimizing cognitive load and maximizing situational awareness. It will present information in a clear and concise manner, utilizing intuitive icons, visual cues, and spatial audio to guide the Soldier's attention.
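The procedural guidance and real-time feedback functionalities above can be sketched as a simple step tracker: the system holds an ordered procedure, advances when the CV pipeline's observation matches the current step's expected state, and emits corrective feedback otherwise. The class, step names, and observation labels below are hypothetical placeholders, not the fielded system's API.

```python
class ProcedureTracker:
    """Minimal step tracker for the real-time feedback loop described above.

    `steps` is an ordered list of (step_name, expected_observation) pairs.
    Observations are opaque labels a CV pipeline would emit per frame.
    """
    def __init__(self, steps):
        self.steps = steps
        self.index = 0

    def current_instruction(self):
        if self.index >= len(self.steps):
            return "procedure complete"
        return self.steps[self.index][0]

    def observe(self, observation):
        """Return feedback for one CV observation; advance on a match."""
        if self.index >= len(self.steps):
            return "procedure complete"
        name, expected = self.steps[self.index]
        if observation == expected:
            self.index += 1
            return f"correct: {name}"
        return f"recheck: expected '{expected}' for step '{name}'"

steps = [
    ("attach front sling", "sling_front_attached"),
    ("attach rear sling", "sling_rear_attached"),
    ("tension ratchet strap", "strap_tensioned"),
]
tracker = ProcedureTracker(steps)
print(tracker.observe("sling_front_attached"))  # correct: attach front sling
print(tracker.observe("strap_tensioned"))       # out of order -> recheck message
print(tracker.observe("sling_rear_attached"))   # correct: attach rear sling
```

The feedback strings here would map to the AR overlays, voice prompts, and haptic cues described in the feedback section; the out-of-order case is what lets the system catch procedural deviations as they happen rather than after the fact.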

Training Data

Training the AR/CV model will involve a multi-faceted approach, combining data from digital manuals, on-site acquisition, and, potentially, digitization of physical assets.

  • Digital Manuals: Army-provided digital manuals and technical orders will be processed using RedShred's platform to extract structured data on rigging procedures, equipment specifications, and safety protocols. This data will serve as the foundation for training the object recognition and task assessment models.

  • On-site Data Acquisition: We will conduct site visits to rigging facilities to capture real-world scenarios, collecting data on various lighting conditions, equipment configurations, and Soldier actions. This data will be used to fine-tune the CV models and ensure their robustness in diverse operational environments.

  • Digitization of Physical Assets: If necessary, we will employ 3D scanning and photogrammetry techniques to digitize physical assets, such as parachute components and rigging equipment. This will create a comprehensive dataset for training the object recognition and tracking models, particularly for handling deformable objects.

By combining these data sources, we will create a robust and diverse training dataset that enables the AR/CV system to perform accurately and reliably in a wide range of operational conditions.
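One practical detail of combining these sources is tagging each training sample with its provenance, so model performance can later be evaluated per source (e.g., manual-derived vs. field-collected). The sketch below shows the idea; all sample IDs and labels are illustrative, not drawn from actual Army technical data.

```python
def build_manifest(manual_items, field_clips, scanned_assets):
    """Merge the three data sources into one labeled training manifest.

    Each entry records its provenance so models can be evaluated per source.
    Inputs are lists of (sample_id, label) pairs.
    """
    manifest = []
    for source, items in [("manual", manual_items),
                          ("field", field_clips),
                          ("scan", scanned_assets)]:
        for sample_id, label in items:
            manifest.append({"id": sample_id, "label": label, "source": source})
    return manifest

manifest = build_manifest(
    manual_items=[("to-fig-12", "ratchet_strap")],
    field_clips=[("site-a-clip-3", "parachute_pack")],
    scanned_assets=[("scan-007", "cargo_hook")],
)
print(len(manifest))                              # 3
print(sorted({m["source"] for m in manifest}))    # ['field', 'manual', 'scan']
```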

Task Assessment and Feedback

The AR/CV system will employ a multi-modal approach to task assessment and feedback, leveraging real-time computer vision analysis, machine learning algorithms, and intuitive user interface design to ensure accurate evaluation and effective guidance throughout the rigging process.

Real-time Computer Vision Analysis: The system will continuously monitor the Soldier's actions and the state of the equipment using the integrated sensor suite. Computer vision algorithms will track the position and orientation of key objects, identify task completion states, and detect any deviations from the prescribed procedures. RedShred's expertise in real-time work tracking, as demonstrated in the COACH system, will be instrumental in developing this capability.

Machine Learning-based Assessment: Machine learning models will be trained on a diverse dataset of rigging scenarios to recognize and classify correct and incorrect actions. These models will analyze the visual and spatial data captured by the sensors, enabling the system to assess the correctness of the Soldier's actions with high accuracy and provide real-time feedback.

Multi-Modal Feedback: The system will provide feedback through multiple modalities to ensure clear and effective communication:

  • Visual Feedback: The AR interface will highlight incorrect actions or components, overlay corrective instructions, and provide visual cues to guide the Soldier towards proper task completion. This could include arrows indicating the correct placement of straps, color-coded highlights for misaligned components, or animated demonstrations of the required actions.

  • Auditory Feedback: The system will provide verbal prompts and instructions, offering additional guidance and reinforcing visual cues. Auditory feedback can be particularly useful in situations where the Soldier's visual attention is focused on a specific task or when ambient noise levels are high.

  • Haptic Feedback: In certain scenarios, haptic feedback through vibrations or tactile cues may be employed to provide subtle yet effective guidance. For example, a vibration could alert the Soldier to an incorrect hand placement or a misaligned component.

  • Adaptive Feedback: The feedback mechanisms will be adaptive, adjusting to the Soldier's skill level and performance. As the Soldier progresses through the rigging process, the system will provide less explicit guidance and rely more on subtle cues and reminders, fostering independent learning and problem-solving skills.

By combining real-time computer vision analysis, machine learning-based assessment, and multi-modal feedback, the AR/CV system will create a closed-loop learning environment that empowers Soldiers to master equipment rigging tasks efficiently and accurately. The system's ability to provide immediate and actionable feedback will help prevent errors, improve training outcomes, and ultimately enhance operational readiness.
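The adaptive feedback behavior described above can be sketched as a rolling window over recent pass/fail outcomes: a high recent error rate triggers fully explicit guidance, while a low rate fades the system toward subtle cues. The window size and thresholds below are illustrative assumptions, not validated training parameters.

```python
from collections import deque

class AdaptiveFeedback:
    """Scale feedback explicitness to the Soldier's recent performance.

    Keeps a rolling window of pass/fail outcomes; a high recent error rate
    triggers explicit guidance, a low rate fades to subtle cues.
    """
    def __init__(self, window=10, explicit_above=0.4, subtle_below=0.1):
        self.outcomes = deque(maxlen=window)
        self.explicit_above = explicit_above
        self.subtle_below = subtle_below

    def record(self, correct):
        self.outcomes.append(1 if correct else 0)

    def mode(self):
        if not self.outcomes:
            return "explicit"          # no history: full step-by-step guidance
        error_rate = 1 - sum(self.outcomes) / len(self.outcomes)
        if error_rate > self.explicit_above:
            return "explicit"          # overlays + voice prompts + haptics
        if error_rate < self.subtle_below:
            return "subtle"            # minimal cues and reminders
        return "standard"              # visual cues, voice only on errors

fb = AdaptiveFeedback()
for correct in [False, False, True, True]:
    fb.record(correct)
print(fb.mode())  # explicit
```

Because the window is bounded, a Soldier who recovers from early mistakes will see the guidance fade within a fixed number of steps, which supports the independent-learning goal stated above.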

Testing and Evaluation

We will conduct extensive testing in a realistic rigging facility environment, simulating various operational scenarios and challenges. We will evaluate the system's performance based on metrics such as task completion time, error rate, and user satisfaction.
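The evaluation metrics named above reduce to simple aggregation over per-trial logs, sketched below. The field names and the 1-5 satisfaction scale are illustrative assumptions about the logging format, not a defined test protocol.

```python
def evaluate_trials(trials):
    """Aggregate completion time, error rate, and satisfaction from trial logs.

    `trials` is a list of dicts with `duration_s`, `errors`, and a 1-5
    `satisfaction` rating per rigging trial.
    """
    n = len(trials)
    return {
        "mean_completion_s": sum(t["duration_s"] for t in trials) / n,
        "error_rate": sum(t["errors"] for t in trials) / n,
        "mean_satisfaction": sum(t["satisfaction"] for t in trials) / n,
    }

trials = [
    {"duration_s": 1800, "errors": 2, "satisfaction": 4},
    {"duration_s": 1500, "errors": 0, "satisfaction": 5},
]
report = evaluate_trials(trials)
print(report["mean_completion_s"])  # 1650.0
print(report["error_rate"])         # 1.0
```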

Success will be defined by the Army's requirement: four untrained Soldiers, guided solely by the AR/CV system, must be able to successfully rig a JLTV for airdrop.

Path to Phase III: Advanced Rigging Training and System Refinement

Our path to Phase III is clear: with Limitless Flight as our teaming partner, we will establish a dedicated physical training facility within their state-of-the-art AR/VR parachute training center. This unique environment will enable full-scale rigging training with actual equipment and parachute systems, while mitigating risks through controlled cargo deployment and harness systems. Soldiers will gain unparalleled situational awareness and experience rigging procedures in a highly realistic yet safe setting, fostering confidence and proficiency.

Phase III would focus on refining the AR/CV system based on extensive user feedback and performance data gathered during Phase II. We will iterate on the user interface, optimize algorithms for enhanced accuracy and robustness, and expand the system's capabilities to encompass a wider range of rigging scenarios and equipment types. The partnership with Limitless Flight will provide invaluable insights into real-world rigging challenges and user needs, ensuring the system's practicality and effectiveness in operational environments. We will also explore integrating the AR/CV system with other emerging technologies, such as advanced simulation environments, to create an even more immersive and impactful training experience, and will investigate the feasibility of deploying the system in diverse operational contexts, including austere environments and limited-connectivity scenarios, to ensure its adaptability and utility across the full spectrum of military operations.

The ultimate goal is the delivery of a fully mature, field-ready AR/CV rigging guidance system that empowers Soldiers to perform complex rigging tasks with confidence, precision, and safety, significantly enhancing the Army's aerial delivery capabilities and operational readiness.

Team Qualifications

Our team comprises experts in AR/CV development, software engineering, training systems, and military applications. The proposed project boasts a team of highly qualified individuals whose expertise directly aligns with the technical challenges and objectives of developing an AR/CV-based rigging guidance system. We have a proven track record of delivering innovative solutions for complex challenges. 

Key Personnel

  • Dr. Samuel H. Friedman (RedShred), US Citizen, CV Technical Lead: Dr. Samuel H. Friedman is an expert in multiple domains, including machine learning, artificial intelligence, and computer vision. At RedShred, Dr. Friedman has worked on perceptually-enabled task guidance (PTG) for DARPA, on determining the authoritativeness and lineage of documents for the Air Force, on synchronizing documents for Aircraft Battle Damage Repair (ABDR) for the Navy, and on improving machine learning tasks. Prior to joining RedShred, Dr. Friedman worked on more than 30 SBIRs for the Navy, the Air Force, the Army, the MDA, the DLA, SOCOM, NASA, and the DOE, serving as PI for projects as diverse as using physics to improve machine learning classification of ships for the Navy. Dr. Friedman earned a B.A. in Physics with Specialization in Astronomy and Astrophysics, a B.S. in Mathematics, and a B.A. in Religion and the Humanities from the University of Chicago (2004), followed by an M.S. in Astronomy (2006) and a Ph.D. in Astronomy (2011) from the University of Wisconsin-Madison. Dr. Friedman is a U.S. Citizen and holds an active Secret security clearance.

  • Ali Mahvan (Terasynth), US Citizen, Principal Investigator: Ali Mahvan, the CEO of Terasynth, is a seasoned technologist with a deep understanding of the intricacies of AR/VR development. His expertise extends beyond mere software engineering; he possesses a profound grasp of 3D modeling, spatial computing, and user interface design principles that are crucial for crafting immersive and intuitive AR experiences. Mr. Mahvan's leadership in successfully delivering numerous AR/VR projects, coupled with his technical proficiency, will be instrumental in steering the project's technical vision and ensuring the seamless integration of complex AR/CV components into a cohesive and user-friendly system.

  • William Taubenheim (Terasynth), US Citizen, AR Technical Lead: Mr. Taubenheim is a recognized authority in Unreal Engine development, with a proven track record of creating scalable and high-fidelity VR and simulation applications. As one of a select group of official Meta development partners, he possesses an in-depth understanding of the latest advancements in VR technology and best practices for optimizing performance and user experience. His expertise in Unreal Engine, combined with his experience in building large-scale simulations, will be invaluable in developing the robust and immersive AR environment required for effective rigging training.

  • Derrick Flitcroft (Consultant), US Citizen, Subject Matter Expert: Mr. Flitcroft's hands-on experience as an Air Force crew chief, particularly his work with C-17 and KC-135 aircraft, provides invaluable insights into the practical challenges and operational requirements of equipment rigging. His intimate familiarity with ratchet straps and cargo handling procedures will ensure that the AR/CV system is designed to meet the specific needs of Soldiers in the field.

  • RedShred: RedShred's core competencies lie in its proficiency in extracting, structuring, and interpreting complex technical data from diverse sources, including manuals, diagrams, and videos. This expertise is directly applicable to the proposed project, ensuring the AR/CV system has access to precise procedural guidance from Army rigging manuals and technical orders. Furthermore, RedShred's work on the COACH project for DARPA underscores its capability in real-time task tracking and procedural guidance, essential for monitoring soldier actions, providing immediate feedback, and offering corrective instructions during rigging. The COACH system's ability to dynamically integrate and adapt to new machine learning models ensures the AR/CV system's continuous performance improvement and incorporation of the latest advancements in object recognition, task assessment, and fault detection.

  • Terasynth: Terasynth's core strengths lie in their proficiency in developing cutting-edge AR/CV solutions, coupled with extensive experience in creating training and simulation systems, particularly for the military. Their expertise in crafting robust and user-friendly AR interfaces, along with their proficiency in computer vision algorithms, is crucial for building an effective rigging guidance system. Moreover, their focus on user-centric design aligns perfectly with the Army's objective, ensuring the development of a system that is both intuitive and effective in enhancing Soldier performance.

Past Performance

Through its USAF SBIRs, RedShred has established a leading capability in extracting and reading textual and diagrammatic maintenance aids, including specialized models tailored to the maintenance-domain language used in technical orders. Through its NSF SBIRs, it has extended the platform further to read data points from common infographics such as bar charts and line graphs and to use those extractions to interpret what the charts mean. The RedShred platform's JSON-based output has been used to let development teams ingest legacy PDF tech data for platforms such as the F-16, B-42, F-15, and C-130 and to power knowledge graphs, domain-tailored voice assistants, troubleshooting aids, and virtual and augmented reality training solutions. RedShred has patents awarded and pending for its extraction technologies: it has been granted a patent for automatically assessing structured data for decision-making (US 10,810,240) and has patents pending for document segmentation (17/577,793) and geographic management of document content (63/264,630).

This team's combined expertise in AR/CV development, software engineering, training systems, and military applications, coupled with their direct experience in equipment rigging and task guidance, positions them exceptionally well to deliver a successful and impactful solution that meets the Army's objectives.

Timeline & Deliverables

  • Months 1-4: Prototype Development and Initial Data Acquisition

    • Design and development of the core AR/CV software architecture

    • Selection and procurement of hardware components for the Soldier-worn device

    • Initial data collection from digital manuals and technical orders using RedShred's platform

    • Deliverables:

      • Software architecture design document

      • Hardware specifications and procurement plan

      • Initial dataset of structured rigging procedures

  • Months 5-9: Model Training and Refinement

    • On-site data acquisition at rigging facilities

    • Digitization of physical assets (if necessary)

    • Training and refinement of object recognition, tracking, and task assessment models

    • Integration of machine learning models into the AR/CV software

    • Deliverables:

      • Comprehensive training dataset

      • Trained machine learning models

      • Updated AR/CV software with integrated ML capabilities

  • Months 10-15: System Integration and Preliminary Testing

    • Integration of hardware and software components into a functional prototype

    • Controlled environment testing to validate core functionalities and assess baseline performance

    • Refinement of the user interface based on initial user feedback

    • Deliverables:

      • Functional AR/CV prototype system

      • Test reports from controlled environment testing

      • Updated user interface design

  • Months 16-20: Realistic Rigging Facility Testing and Evaluation

    • Deployment in a realistic rigging facility environment with full-scale scenarios

    • Soldier performance evaluation and feedback collection

    • Assessment of remote assistance capabilities

    • Iterative system refinement based on testing outcomes

    • Deliverables:

      • Comprehensive test reports and performance evaluations

      • Detailed user feedback and system improvement recommendations

  • Months 21-24: Final Demonstration, Refinement, and Reporting

    • Final demonstration of the refined system to Army stakeholders

    • Incorporation of final feedback and adjustments

    • Preparation of final project report and documentation

    • Deliverables:

      • Finalized AR/CV rigging guidance system

      • Final project report and documentation

This extended timeline allows for a dedicated period of testing and feedback at the end, ensuring ample opportunity to refine the system based on real-world usage and Soldier input. This will result in a more robust, user-friendly, and effective final product that meets the Army's high standards for training and operational readiness.
