UFF Search - Unsorted Folder Full-Text Search
UFF Search is a powerful desktop application for Windows that allows you to perform fast, intelligent, and fuzzy full-text searches on your local files, including searching inside ZIP archives.
It builds a local search index for the folders you specify, allowing you to quickly find documents based on their meaning (semantic search) and specific keywords, even with typos in your search query.
Key Features
- Hybrid Search: Combines state-of-the-art semantic search (understanding the meaning of your query) with traditional keyword search (finding exact words). This delivers more relevant results than simple text matching.
- ZIP Archive Search: Indexes and searches the content of files inside .zip archives.
- Fuzzy Search: Finds relevant files even if your search term has typos, powered by rapidfuzz.
- Wide File Type Support: Extracts text from:
  - PDFs (.pdf)
  - Microsoft Office (.docx, .xlsx, .pptx)
  - Plain text formats (.txt, .md, .py, .json, .csv, .html, .log, .ini, .xml)
- Simple UI: An easy-to-use interface to manage your indexed folders and view search results.
- Click to Open: Search results can be clicked to open the file directly (or the containing ZIP archive).
- Self-Contained: Stores its index and all data in your local application data folder for privacy and portability.
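As an illustration of the ZIP Archive Search feature, the sketch below reads the plain-text members of an archive with Python's standard zipfile module. It is not taken from the UFF Search source: the function name iter_zip_text, the TEXT_EXTENSIONS set, and the UTF-8 decoding policy are assumptions made for this example.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical set for this example, mirroring the plain-text formats listed above.
TEXT_EXTENSIONS = {".txt", ".md", ".py", ".json", ".csv", ".html", ".log", ".ini", ".xml"}


def iter_zip_text(zip_source):
    """Yield (member_name, text) for each plain-text member of a ZIP archive."""
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Member names inside a ZIP always use forward slashes.
            if PurePosixPath(info.filename).suffix.lower() not in TEXT_EXTENSIONS:
                continue
            raw = archive.read(info)
            # Assumption: treat content as UTF-8 and replace undecodable bytes
            # instead of aborting the scan.
            yield info.filename, raw.decode("utf-8", errors="replace")


# Build a small archive in memory so the example runs without touching the disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("notes/todo.txt", "buy milk")
    archive.writestr("image.bin", b"\x00\x01")

buffer.seek(0)
found = dict(iter_zip_text(buffer))
print(found)
```

Only the .txt member is returned here; the binary member is skipped because its extension is not in the set.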
How It Works
UFF Search uses a two-pronged approach for searching:
- Semantic Search: When you search, your query is converted into a numerical representation (a vector) using the all-MiniLM-L6-v2 sentence-transformer model. The application finds files whose content is semantically similar to your query.
- Keyword Search: The application also uses a traditional full-text search (SQLite FTS5) and fuzzy matching to find files containing the keywords in your query, or close matches to them.
A hybrid scoring system ranks the results, giving you the best of both worlds.
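The idea of a hybrid score can be sketched in a few lines. This is a minimal illustration, not the application's actual ranking code: the 0.6 weighting is an assumption, the toy two-dimensional vectors stand in for real all-MiniLM-L6-v2 embeddings, and the standard-library difflib module is used in place of rapidfuzz so that the example runs without extra dependencies.

```python
import math
from difflib import SequenceMatcher


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fuzzy_keyword_score(query, text):
    """Average, over query words, of the best fuzzy match among the text's words."""
    words = text.lower().split()
    terms = query.lower().split()
    if not words or not terms:
        return 0.0
    best = [max(SequenceMatcher(None, t, w).ratio() for w in words) for t in terms]
    return sum(best) / len(best)


def hybrid_score(semantic, keyword, semantic_weight=0.6):
    """Weighted sum of both signals; the weight is an assumed value."""
    return semantic_weight * semantic + (1.0 - semantic_weight) * keyword


# Toy data: hand-written vectors instead of real sentence embeddings.
query_text, query_vec = "invoise", [1.0, 0.0]  # note the typo in the query
documents = {
    "invoice.txt": ([0.9, 0.1], "invoice for march"),
    "recipe.txt": ([0.1, 0.9], "pancake recipe"),
}

scores = {
    name: hybrid_score(cosine_similarity(query_vec, vec),
                       fuzzy_keyword_score(query_text, text))
    for name, (vec, text) in documents.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A weighted sum is only one way to merge the two signals; it has the advantage that a document scoring well on either signal can still surface, which is what lets the misspelled query find the invoice.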
Installation
From Source
To run the application from the source code, you'll need Python 3.
- Clone the repository:
    git clone https://github.com/BildoBeucklin/unsorted-folder-full-text-search.git
    cd unsorted-folder-full-text-search
- Install dependencies: It is highly recommended to use a virtual environment.
    pip install -r requirements.txt
- Run the application:
    python main.py
Building from Source
To create a standalone executable from the source code, you can use pyinstaller:
- Install PyInstaller:
    pip install pyinstaller
- Build the executable:
    pyinstaller --noconfirm --onedir --windowed --add-data "assets;assets" --icon "assets/favicon.ico" main.py
  Or:
    pyinstaller main.spec
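The build output is placed in the dist folder. Because the first command uses --onedir, it produces a folder containing the executable together with its supporting files rather than one single file; the output of the second command depends on the settings in main.spec. The build may take some time.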
Usage
(The user interface text is currently available only in German.)
- Start the application.
- Click " + Hinzufügen" (Add) to select a folder you want to index. The application will start scanning it immediately.
- Once indexing is complete, type your search query into the search bar and press Enter or click "Suchen" (Search).
- Results will appear below. Click on any result to open the file. If the file is inside a ZIP archive, the ZIP file will be opened.
- To re-scan a folder for changes, select it from the list and click "↻ Neu scannen" (Rescan).
- To remove a folder, select it and click " - Entfernen" (Remove).
Technical Details
- Framework: PyQt6
- Database: SQLite with FTS5 for full-text indexing.
- Search Technology:
  - sentence-transformers (specifically all-MiniLM-L6-v2) for semantic search.
  - rapidfuzz for fuzzy string matching.
- File Processing:
  - pdfplumber for PDF text extraction.
  - python-docx for .docx files.
  - openpyxl for .xlsx files.
  - python-pptx for .pptx files.
- Index Location: The search index database (uff_index.db) is stored in %LOCALAPPDATA%\UFF_Search on Windows.
- Size: approx. 400-600 MB
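To locate the index file from a script, the path can be derived from the LOCALAPPDATA environment variable. The helper below is a hypothetical sketch, not code from the application; the function name index_db_path and the fallback to the home directory when the variable is unset are assumptions.

```python
import os
from pathlib import Path


def index_db_path(env=None):
    """Return the expected location of uff_index.db for a given environment mapping."""
    env = os.environ if env is None else env
    base = env.get("LOCALAPPDATA")
    if base is None:
        # Assumption: fall back to the home directory when LOCALAPPDATA is not set
        # (for example, when this sketch is run on a non-Windows system).
        base = str(Path.home())
    return Path(base) / "UFF_Search" / "uff_index.db"


print(index_db_path())
```

Passing an explicit mapping, such as index_db_path({"LOCALAPPDATA": "D:/AppData/Local"}), makes the helper easy to check without changing the real environment.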
License
This project is licensed under the GNU Affero General Public License v3.0. See the LICENSE file for details. This license requires that if you use this software in a product or service that is accessed over a network, you must also make the source code available to the users of that product or service.
Contact & Support
You can contact me at: https://rossmann-it-solutions.de/contact/
