# Machine Parse

### Subtitles

**OCR**

Subtitles can be OCRed with the --ocr argument, which runs one of two OCR engines. EasyOCR tends to give better results but supports only about 100 languages. Tesseract can OCR over 130 languages.

Currently, OCR is used to help identify tracks.

This process can take a while, so it is turned off by default.

To speed things up, by default only the first 50 lines and the last lines of a track are OCRed.

You can change this limit by editing the maxOCRLineCount value in config.py; see the advanced subpages of the documentation.
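As a rough illustration of the line limit described above, the following sketch picks the first lines and the last lines of a track for OCR. It is not the tool's actual implementation: the function name `select_lines_for_ocr` and the `tail_count` parameter are hypothetical, and the number of trailing lines is an assumption, since this page does not state it.

```python
from typing import List

# Stand-in for the maxOCRLineCount setting in config.py (value assumed).
maxOCRLineCount = 50


def select_lines_for_ocr(
    lines: List[str],
    head_count: int = maxOCRLineCount,
    tail_count: int = 5,  # assumption: the real number of trailing lines is not documented here
) -> List[str]:
    """Return the first head_count lines plus the last tail_count lines."""
    if head_count < 0 or tail_count < 0:
        raise ValueError("line counts must be non-negative")
    # Short tracks are returned whole so no line is selected twice.
    if len(lines) <= head_count + tail_count:
        return list(lines)
    return lines[:head_count] + lines[len(lines) - tail_count:]


if __name__ == "__main__":
    track = [f"line {i}" for i in range(200)]
    picked = select_lines_for_ocr(track)
    print(len(picked), picked[0], picked[-1])
```

Sampling both ends keeps the OCR pass short while still covering the start and end of a track.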

**Subimages**

If a subtitle track is selected for OCR, temporary images are generated. By default the value is set to none, which means no subtitles are OCRed.

To keep the temporary images generated during the OCR process permanently, pass both the --ocr and --keepocr arguments.

#### **Subimages** without OCR

It is also possible to generate subimages for all enabled tracks without processing them with OCR.

Pass the --keepocr language argument without the --ocr argument.

Note: for a little more control over which tracks are enabled, you can pass both the --sublang and --sortpref arguments.
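The combinations of --ocr and --keepocr described in this section can be summarized with a small sketch. It only models the documented behavior. The names `SubimagePlan` and `plan_subimages` are hypothetical, and it assumes both arguments default to none and count as "passed" whenever a value is supplied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubimagePlan:
    generate_images: bool  # subtitle images are written to disk
    run_ocr: bool          # images are passed to an OCR engine
    keep_images: bool      # images are kept after processing


def plan_subimages(ocr: Optional[str] = None,
                   keepocr: Optional[str] = None) -> SubimagePlan:
    """Model what happens for each --ocr / --keepocr combination."""
    run_ocr = ocr is not None
    keep_images = keepocr is not None
    # Images are needed either as OCR input or because they were requested directly.
    return SubimagePlan(
        generate_images=run_ocr or keep_images,
        run_ocr=run_ocr,
        keep_images=keep_images,
    )


if __name__ == "__main__":
    # "value" is a placeholder; the accepted argument values are not listed on this page.
    print(plan_subimages())                              # neither argument
    print(plan_subimages(ocr="value"))                   # OCR with temporary images
    print(plan_subimages(ocr="value", keepocr="value"))  # OCR, images kept
    print(plan_subimages(keepocr="value"))               # images only, no OCR
```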

### voiceRec

voiceRec is not as accurate as OCR, but it can still help identify tracks. Use the --voicerec argument to enable this feature. To save time, it only records about 50 lines' worth of the track.

You can change this limit by editing the maxVoiceLineCount value in config.py; see the advanced subpages of the documentation.
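A change to this setting might look like the fragment below. This is only an illustration: the surrounding layout of config.py is an assumption, and 100 is an arbitrary example value rather than a recommendation.

```python
# config.py (illustrative fragment; the real file contains other settings)

# Number of lines' worth of audio that voiceRec samples per track.
# The value here is an example; larger values sample more of the track
# and take longer to process.
maxVoiceLineCount = 100
```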

