Cutsio Blog

How to Find a Specific Shot in 100 Hours of ARRI RAW Footage

Stop scrubbing through terabytes of ARRI RAW dailies. Learn how Visual Search lets you find any shot across 100 hours of footage by describing what the camera saw — in seconds, not hours.

How do you find a specific shot in 100 hours of ARRI RAW footage without scrubbing?

Stop scrubbing. Upload ARRI RAW originals to Cutsio, which generates streamable review assets, indexes every frame with Visual Intelligence, and lets you find any shot by describing what the camera saw — "close-up of actor by window," "car chase sunset," "two-shot kitchen argument" — in seconds.

A typical narrative feature film generates 60 to 100 hours of ARRI RAW footage across a 30 to 50 day shoot. The native file formats — .ari, .mxf, and .arx — contain the full sensor data from Alexa cameras like the Alexa 35, Alexa Mini LF, and Alexa SXT. Finding a specific shot in that library using traditional methods means scrubbing through every clip, reading timecode burn-ins, flipping through the DIT's camera reports, and praying the scene you need was properly logged. This process is slow, error-prone, and wastes more editorial time than almost any other task in post-production.

Cutsio eliminates the search problem entirely. Through the enterprise ARRI RAW add-on, you upload the native files, Cutsio transcodes them into streamable review assets in the cloud, and Visual Intelligence indexes every single frame — objects, people, actions, lighting conditions, scene composition, and any scratch audio transcription. The result is a fully searchable ARRI RAW library where you find shots by describing what you need, not by watching hours of footage.


Why is searching ARRI RAW footage manually so time-consuming?

Searching ARRI RAW footage manually is time-consuming because the tools available to most editors — bin hierarchies, thumbnail grids, and timeline scrubbing — were designed for small projects, not 100-hour feature film libraries with terabytes of raw sensor data.

Here is what the traditional search process actually looks like on a feature film:

| Step | Time Required | Failure Point |
| :--- | :--- | :--- |
| Review DIT camera report | 15–30 min per day | Reports are often incomplete or inaccurate |
| Navigate folder hierarchy | 10–20 min per scene | Folder naming conventions break down by week 2 |
| Scrub through clips visually | 2–5 min per clip | You miss the frame you need and have to re-scrub |
| Cross-reference multiple takes | 5–10 min per scene | Takes are mislabeled or unlabeled |
| Export and send selects | 15–30 min | Wrong version gets exported |

Over a 100-hour feature, the cumulative time spent just searching for the right frames can reach 40 to 60 hours of editorial labor. At 40 hours per week, that is one to one and a half full workweeks of pure search time that could have been spent on creative editing.

Cutsio's Visual Intelligence collapses this into seconds. Instead of asking "where did I put that clip," you ask "show me the close-up where the actor delivers the line about the letter" — and the system returns the exact frame immediately.

How does Visual Search work with ARRI RAW footage?

Visual Search indexes every frame of the ARRI RAW review stream using computer vision models that recognize objects, scenes, actions, faces, and spatial relationships — then lets you query the entire library using natural language descriptions.

The technology works in three stages:

  1. Frame Analysis: During ingestion, Cutsio's Visual Intelligence engine analyzes every frame of the review stream. It identifies objects (cars, furniture, weapons, props), scenes (interior, exterior, nighttime, golden hour, office, kitchen), actions (walking, running, sitting, driving, arguing), and people (main actors, extras, crew).
  2. Semantic Indexing: Each frame is embedded into a high-dimensional vector space that maps visual concepts. "Golden hour close-up with soft shadows" and "warm-toned medium shot of actor near window" are mapped as related concepts, even though they use different words.
  3. Natural Language Querying: When a user types "show me the master shot from the living room scene where the detective finds the letter," the system compares that query against the visual index and returns matching clips ranked by relevance.
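The three stages above can be sketched as a small vector search. This is a minimal illustration, not Cutsio's implementation: the three-dimensional vectors, the clip names, and the `search` helper are all hypothetical stand-ins for what a real vision and text embedding model would produce, and cosine similarity is assumed as the relevance measure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical frame index. Real embeddings have hundreds of dimensions;
# here the axes loosely mean (warm light, close framing, vehicle present).
FRAME_INDEX = {
    "A012C003 @ 01:02:10:05": [0.9, 0.8, 0.0],
    "A014C001 @ 02:15:44:12": [0.8, 0.3, 0.1],
    "B007C009 @ 05:31:02:00": [0.1, 0.0, 0.95],
}

def search(query_vec, index, top_k=2):
    """Return the top_k (frame, score) pairs, most similar first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Stand-in for the embedding of the text query "golden hour close-up".
query = [1.0, 0.9, 0.0]
results = search(query, FRAME_INDEX)

for name, score in results:
    print(f"{score:.3f}  {name}")
```

The warm close-up frame ranks first and the vehicle frame drops out of the top two, even though no keyword was ever attached to either clip. That is the property that separates embedding search from tag matching.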

This is fundamentally different from keyword-based transcript search. Transcript search only works if there is spoken audio to transcribe. ARRI RAW footage shot MOS — common for action sequences, B-roll, and atmospheric establishing shots — has no dialogue to search. Visual Search fills this gap because it understands what the camera saw, not just what was said.

What kinds of searches work best with ARRI RAW Visual Search?

The most effective searches combine visual elements, scene context, and specific details. Examples that work well:

  • "Close-up of lead actress crying in bedroom with window light"
  • "Wide master shot of restaurant argument scene"
  • "Car driving through intersection at night — tail lights visible"
  • "Two-shot of detective and witness at kitchen table"
  • "Golden hour exterior of house establishing shot"
  • "Any take where the boom mic dips into the top of frame"

The system understands these as visual concepts, not keyword tags. Searching "car at night" will return clips where a car appears in low-light conditions, even if those exact words were never typed into any metadata field.


How do you search an ARRI RAW library by scene, take, or technical metadata?

Beyond visual search, Cutsio indexes all available metadata from the ARRI RAW files — including scene numbers, take numbers, camera settings, and color metadata — so you can combine visual queries with technical filters for precision search.

When the DIT uploads ARRI RAW files organized by scene and take (standard practice on professional sets), Cutsio preserves that organizational structure. You can filter by:

  • Scene: "Show me everything from Scene 24"
  • Take: "Only return takes marked as good"
  • Camera: "A-cam footage only" or "B-cam footage only"
  • Date: "Footage from Day 3 through Day 7"
  • Lens or Focal Length: If the metadata is present in the ARRI RAW headers
  • Color Space: "Clips with ARRI LogC4 vs LogC3"

Combining these filters with Visual Search creates extremely precise queries: "Find me the close-up of the lead actor from Scene 24, A-cam, golden hour takes only." The system returns exactly the matching frames, ranked by relevance.
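One way to picture a combined query is as a metadata filter followed by a relevance sort. The sketch below is an assumption about how such a query could be structured, not Cutsio's API: the `Clip` record, its field names, and the precomputed `visual_score` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    name: str
    scene: int
    camera: str          # "A" or "B"
    visual_score: float  # relevance to the visual query, assumed precomputed

def filtered_search(clips, scene=None, camera=None, min_score=0.0):
    """Apply hard metadata filters first, then rank survivors by visual relevance."""
    hits = [
        c for c in clips
        if (scene is None or c.scene == scene)
        and (camera is None or c.camera == camera)
        and c.visual_score >= min_score
    ]
    return sorted(hits, key=lambda c: c.visual_score, reverse=True)

# Hypothetical library entries.
clips = [
    Clip("A024C001", scene=24, camera="A", visual_score=0.91),
    Clip("B024C001", scene=24, camera="B", visual_score=0.95),
    Clip("A024C002", scene=24, camera="A", visual_score=0.62),
    Clip("A031C004", scene=31, camera="A", visual_score=0.88),
]

hits = filtered_search(clips, scene=24, camera="A", min_score=0.5)
names = [c.name for c in hits]
print(names)
```

Filtering before ranking means a highly relevant B-cam clip never displaces an A-cam result when the query asks for A-cam only.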

How do Collections help organize search results across a 100-hour ARRI RAW library?

Collections in Cutsio let you save, organize, and share search results as visual hubs — so every time you find a great shot, you can instantly add it to a curated selects reel without exporting, copying, or renaming files.

Once you find a matching shot through Visual Search, you can:

  1. Click "Add to Collection" to save it to a selects reel for a specific scene.
  2. Organize Collections by scene, character, shot type, or any custom taxonomy.
  3. Share the entire Collection as a single review link for the director or editor.
  4. Export a selects EDL from the Collection referencing the original ARRI RAW file names and timecodes.

This transforms the search workflow from "find and download" to "find, curate, and collaborate." The editor does not need to download clips to build a selects reel — they just build Collections directly in Cutsio, and the conform happens later from the original files.

How does Agentic Chat let you search ARRI RAW footage conversationally?

Agentic Chat in Cutsio is a conversational AI interface that answers natural language questions about your ARRI RAW library — such as "find all the two-shots where the lighting changes from day to night" — and returns frame-exact results without manual search syntax or folder navigation.

The production team does not need to learn how to formulate the perfect Visual Search query. They simply ask in plain English:

  • "Which scenes were shot at 48 fps for slow motion?"
  • "Show me every take where the actor enters through the door on the right."
  • "Are there any shots with a visible crew reflection in the window?"
  • "Find the clip where the focus puller misses the mark on the close-up."

Agentic Chat combines Visual Search (understanding visual content), metadata search (understanding camera settings and scene numbers), and Collection context (understanding organizational structure) to return precise results. For ARRI RAW footage specifically, Agentic Chat can also query metadata from the ARRI RAW headers when available.

How do you export search results for the NLE conform?

Once you have found the shots you need using Visual Search, you export a selects EDL or FCPXML that references the original ARRI RAW file names and timecodes — so the editor can import it directly into DaVinci Resolve, Premiere Pro, or Avid Media Composer for the conform.

The export workflow is:

  1. Search for the shots you need using Visual Search or Agentic Chat.
  2. Add matching clips to a Collection or mark them as selects.
  3. Click "Export" and choose your delivery format — EDL, FCPXML, or CSV.
  4. Open your NLE, import the exported EDL or FCPXML, and link it to the original ARRI RAW files on your local RAID.

The original ARRI RAW files are never modified. The EDL simply references them by their original file names and timecodes. The conform happens against the exact same sensor data that came off the Alexa cards.
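To make the "references, not copies" idea concrete, here is a minimal sketch that writes a CMX3600-style EDL from a list of selects. It assumes 24 fps non-drop-frame timecode, and the reel names, file names, and timecodes are hypothetical. The exact output of Cutsio's exporter may differ.

```python
FPS = 24  # assumption: 24 fps, non-drop-frame

def tc_to_frames(tc, fps=FPS):
    """Convert HH:MM:SS:FF to an absolute frame count."""
    h, m, s, f = (int(part) for part in tc.split(":"))
    return ((h * 60 + m) * 60 + s) * fps + f

def frames_to_tc(frames, fps=FPS):
    """Convert an absolute frame count back to HH:MM:SS:FF."""
    s, f = divmod(frames, fps)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d}:{f:02d}"

def build_edl(title, selects, record_start="01:00:00:00"):
    """Lay selects end to end on the record side; the source side is untouched."""
    lines = [f"TITLE: {title}", "FCM: NON-DROP FRAME", ""]
    rec = tc_to_frames(record_start)
    for i, (reel, clip_name, src_in, src_out) in enumerate(selects, start=1):
        duration = tc_to_frames(src_out) - tc_to_frames(src_in)
        rec_in, rec_out = frames_to_tc(rec), frames_to_tc(rec + duration)
        lines.append(
            f"{i:03d}  {reel:<8} V     C        {src_in} {src_out} {rec_in} {rec_out}"
        )
        lines.append(f"* FROM CLIP NAME: {clip_name}")
        rec += duration
    return "\n".join(lines) + "\n"

# Hypothetical selects: (reel, original file name, source in, source out)
selects = [
    ("A012C003", "A012C003.mxf", "01:02:10:05", "01:02:14:05"),
    ("B007C009", "B007C009.mxf", "05:31:02:00", "05:31:03:12"),
]
edl = build_edl("SCENE_24_SELECTS", selects)
print(edl)
```

Each event carries only a reel name, a clip name, and in/out timecodes, which is why the NLE can relink to the camera originals without any media passing through the EDL itself.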

How does storage pricing work for a 100-hour ARRI RAW library?

Cutsio's Storage uses a pay-for-minutes model that separates storage cost from file size: a 100-hour ARRI RAW library (potentially 60+ TB) is billed on the total minutes of footage, not the terabytes.

This is critical for ARRI RAW workflows where a single feature film can generate 60 to 100 TB of raw data. Traditional cloud storage charges by the gigabyte, making it prohibitively expensive to keep a full ARRI RAW library online. With Cutsio, you pay for the duration of the footage, and the media remains streamable, searchable, and shareable through review assets while the original ARRI RAW files are retained as attachments for download and conform.

The review assets are always available for instant search and playback. The original .ari, .mxf, and .arx files are available for download when needed for the final grade.
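The difference between the two billing models comes down to simple arithmetic. In the sketch below both rates are placeholders chosen only to make the code run. They are not Cutsio's prices or any cloud provider's prices, so substitute real quotes before drawing conclusions.

```python
def billable_minutes(hours_of_footage):
    """Duration-based billing depends only on running time."""
    return hours_of_footage * 60

def per_minute_cost(hours_of_footage, rate_per_minute):
    return billable_minutes(hours_of_footage) * rate_per_minute

def per_gb_cost(terabytes, rate_per_gb):
    """Size-based billing scales with the data footprint (decimal TB to GB)."""
    return terabytes * 1000 * rate_per_gb

# PLACEHOLDER rates for illustration only, not real pricing.
PLACEHOLDER_RATE_PER_MINUTE = 0.01
PLACEHOLDER_RATE_PER_GB = 0.02

hours = 100
for terabytes in (60, 100):
    by_minutes = per_minute_cost(hours, PLACEHOLDER_RATE_PER_MINUTE)
    by_size = per_gb_cost(terabytes, PLACEHOLDER_RATE_PER_GB)
    print(f"{terabytes} TB: per-minute {by_minutes:.2f}, per-GB {by_size:.2f}")
```

Whatever the real rates are, the per-minute figure stays the same for both library sizes, because 100 hours is always 6,000 billable minutes. The per-GB figure grows with every extra terabyte.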

FAQ

What kind of searches work with ARRI RAW footage that has no audio?

Visual Search works on the visual content alone, so MOS footage is fully searchable. You can search for objects, scenes, actions, lighting conditions, and compositions even when there is zero scratch audio.

Can I search by actor or character face in ARRI RAW footage?

Visual Search can recognize and return frames containing specific people when the footage is indexed. For productions with consistent principal faces across scenes, this makes searching for a specific actor's close-ups fast and reliable.

How long does it take to index 100 hours of ARRI RAW footage?

Indexing time depends on the total duration and the processing capacity allocated to your enterprise account. Contact the Cutsio sales team for specific timelines based on your production's volume.

Can I combine Visual Search with transcript search?

Yes. When ARRI RAW footage includes a scratch audio track, Cutsio transcribes it alongside the visual indexing. You can search by both visual content and spoken dialogue simultaneously for the most precise results.

Is Visual Search available for all Cutsio users?

Visual Search is available on all Cutsio accounts for standard uploaded video. ARRI RAW ingestion and Visual Search indexing of raw camera files are available as an enterprise add-on for qualified production accounts.

Stop scrubbing. Start searching.

A 100-hour ARRI RAW library should take seconds to search, not weeks. Cutsio indexes every frame with Visual Intelligence so you find any shot by describing what the camera saw. No scrubbing, no camera reports, no wasted editorial time.

  • Visual Search finds any frame by describing what the camera saw
  • Original ARRI RAW files attached for conform and finishing
  • Collections and Agentic Chat for team-wide collaboration


Try Cutsio Free

No credit card required. 60 minutes of free processing.