3
.gitignore
vendored
@@ -3,3 +3,6 @@
|
||||
/dist
/build
__pycache__/
*.iss
UFF-Search.spec
/Output
|
||||
83
README.md
@@ -1,52 +1,101 @@
|
||||
# UFF Search
|
||||
[](https://github.com/BildoBeucklin/unsorted-folder-full-text-search)
|
||||
|
||||
UFF Search is a desktop application for Windows that allows you to perform fast, fuzzy full-text searches on your local files.
|
||||
# UFF Search - Unsorted Folder Full-Text Search
|
||||
|
||||
It builds a search index for the folders you specify, allowing you to quickly find documents even with typos in your search query.
|
||||
UFF Search is a powerful desktop application for Windows that allows you to perform fast, intelligent, and fuzzy full-text searches on your local files, including searching inside ZIP archives.
|
||||
|
||||
## Features
|
||||
It builds a local search index for the folders you specify, allowing you to quickly find documents based on their meaning (semantic search) and specific keywords, even with typos in your search query.
|
||||
|
||||
* **Local Full-Text Search:** Indexes and searches the content of files in your selected folders.
|
||||
## Key Features
|
||||
|
||||
* **Hybrid Search:** Combines state-of-the-art **semantic search** (understanding the *meaning* of your query) with traditional **keyword search** (finding exact words). This delivers more relevant results than simple text matching.
|
||||
* **ZIP Archive Search:** Indexes and searches the content of files *inside* `.zip` archives.
|
||||
* **Fuzzy Search:** Finds relevant files even if your search term has typos, powered by `rapidfuzz`.
|
||||
* **Wide File Type Support:** Extracts text from PDFs, and various plain text formats (`.txt`, `.md`, `.py`, `.json`, `.csv`, `.html`, `.log`, `.ini`, `.xml`).
|
||||
* **Wide File Type Support:** Extracts text from:
|
||||
* PDFs (`.pdf`)
|
||||
* Microsoft Office (`.docx`, `.xlsx`, `.pptx`)
|
||||
* Plain text formats (`.txt`, `.md`, `.py`, `.json`, `.csv`, `.html`, `.log`, `.ini`, `.xml`)
|
||||
* **Simple UI:** An easy-to-use interface to manage your indexed folders and view search results.
|
||||
* **Click to Open:** Search results can be clicked to open the file directly.
|
||||
* **Self-Contained:** Stores its index in your local application data folder.
|
||||
* **Click to Open:** Search results can be clicked to open the file directly (or the containing ZIP archive).
|
||||
* **Self-Contained:** Stores its index and all data in your local application data folder for privacy and portability.
|
||||
|
||||
## How It Works
|
||||
|
||||
UFF Search uses a two-pronged approach for searching:
|
||||
|
||||
1. **Semantic Search:** When you search, your query is converted into a numerical representation (a vector) using the `all-MiniLM-L6-v2` sentence-transformer model. The application finds files whose content is semantically similar to your query.
|
||||
2. **Keyword Search:** The application also uses a traditional full-text search (SQLite FTS5) and fuzzy matching to find files containing the exact keywords in your query.
|
||||
|
||||
A hybrid scoring system ranks the results, giving you the best of both worlds.
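
As a rough illustration of how such a hybrid score can be formed, the sketch below mirrors the weighting visible in `database.py` in this commit (0.65 semantic, 0.35 lexical, plus a small bonus when both signals are strong). The function name `hybrid_score` and the use of `None` for "no keyword hit" are assumptions made for this example only.

```python
from typing import Optional

ALPHA = 0.65  # weight of the semantic score (value taken from database.py)
BETA = 0.35   # weight of the lexical score (value taken from database.py)


def hybrid_score(sem: float, lex: Optional[float]) -> Optional[float]:
    """Combine a semantic and a lexical score, both expected in [0, 1].

    `lex` is None when the document had no keyword hit.
    Returns None when the document should be dropped from the results.
    """
    # Weak semantic match and no keyword hit: not worth showing.
    if sem < 0.15 and lex is None:
        return None
    lex_value = lex if lex is not None else 0.0
    score = sem * ALPHA + lex_value * BETA
    # Small boost if both signals agree that the document is relevant.
    if sem > 0.4 and lex_value > 0.6:
        score += 0.1
    return score
```

Because of the boost, the combined score can exceed 1.0, so it is best read as a ranking value rather than a probability.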
|
||||
|
||||
## Installation
|
||||
|
||||
### Windows Installer
|
||||
A pre-built installer (`UFF_Search_Installer_v3.exe`) is available for easy installation.
|
||||
|
||||
### From Source
|
||||
To run the application from the source code, you'll need Python and the following dependencies:
|
||||
To run the application from the source code, you'll need Python 3.
|
||||
|
||||
1. **Clone the repository:**
|
||||
```bash
|
||||
git clone <repository-url>
|
||||
git clone https://github.com/BildoBeucklin/unsorted-folder-full-text-search.git
|
||||
cd unsorted-folder-full-text-search
|
||||
```
|
||||
|
||||
2. **Install dependencies:**
|
||||
It is recommended to use a virtual environment.
|
||||
It is highly recommended to use a virtual environment.
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
3. **Run the application:**
|
||||
```bash
|
||||
python uff_app.py
|
||||
python main.py
|
||||
```
|
||||
|
||||
## Usage
|
||||
## Building from Source
|
||||
|
||||
To create a standalone executable from the source code, you can use `pyinstaller`:
|
||||
|
||||
1. **Install PyInstaller:**
|
||||
```bash
|
||||
pip install pyinstaller
|
||||
```
|
||||
|
||||
2. **Build the executable:**
|
||||
```bash
|
||||
pyinstaller --noconfirm --onedir --windowed --add-data "assets;assets" --icon "assets/favicon.ico" main.py
|
||||
```
|
||||
Or:
|
||||
```bash
|
||||
pyinstaller main.spec
|
||||
```
|
||||
|
||||
Both commands produce a one-folder build (the executable together with its supporting files) inside the `dist` folder, not a single-file executable. The build may take some time.
|
||||
|
||||
|
||||
## Usage
|
||||
(The UI texts are currently available only in German.)
|
||||
1. Start the application.
|
||||
2. Click **" + Hinzufügen"** (Add) to select a folder you want to index. The application will start scanning it immediately.
|
||||
3. Once indexing is complete, type your search query into the search bar and press Enter or click **"Suchen"** (Search).
|
||||
4. Results will appear below. Click on any result to open the file.
|
||||
4. Results will appear below. Click on any result to open the file. If the file is inside a ZIP archive, the ZIP file will be opened.
|
||||
5. To re-scan a folder for changes, select it from the list and click **"↻ Neu scannen"** (Rescan).
|
||||
6. To remove a folder, select it and click **" - Entfernen"** (Remove).
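
Entries found inside archives are stored in the index under a combined path of the form `<zip path> :: <member name>` (see `indexer.py` in this commit). A minimal sketch of how such a path could be split again to find the archive to open; `split_virtual_path` is a hypothetical helper, not a function from the project:

```python
from typing import Optional, Tuple

SEPARATOR = " :: "  # separator used by indexer.py when building the virtual path


def split_virtual_path(vpath: str) -> Tuple[str, Optional[str]]:
    """Return (file_to_open, member_inside_zip or None)."""
    archive, sep, member = vpath.partition(SEPARATOR)
    if sep and archive.lower().endswith(".zip"):
        return archive, member
    # Regular file on disk: nothing to split.
    return vpath, None
```

For a regular file the function returns the path unchanged, so the caller can open the first element of the tuple in either case.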
|
||||
|
||||
## Technical Details
|
||||
|
||||
* **Framework:** PyQt6
|
||||
* **Database:** SQLite with FTS5 for full-text indexing.
|
||||
* **Search Technology:**
|
||||
* `sentence-transformers` (specifically `all-MiniLM-L6-v2`) for semantic search.
|
||||
* `rapidfuzz` for fuzzy string matching.
|
||||
* **File Processing:**
|
||||
* `pdfplumber` for PDF text extraction.
|
||||
* `python-docx` for `.docx` files.
|
||||
* `openpyxl` for `.xlsx` files.
|
||||
* `python-pptx` for `.pptx` files.
|
||||
* **Index Location:** The search index database (`uff_index.db`) is stored in `%LOCALAPPDATA%\UFF_Search` on Windows.
|
||||
* **Size:** approx. 400-600 MB
|
||||
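
The index location above follows the path logic in `config.py`. The sketch below restates that logic as a pure function so it can be checked without touching the real environment; the function name and its parameters (`os_name`, `env`, `home`) are assumptions for illustration, while the real module reads `os.name`, `LOCALAPPDATA` and the home directory directly.

```python
import ntpath
import posixpath


def index_db_path(os_name: str, env: dict, home: str) -> str:
    """Return where uff_index.db would live for the given platform inputs."""
    if os_name == "nt":
        # Windows: %LOCALAPPDATA%\UFF_Search\uff_index.db
        return ntpath.join(env["LOCALAPPDATA"], "UFF_Search", "uff_index.db")
    # Other systems: ~/.local/share/UFF_Search/uff_index.db
    return posixpath.join(home, ".local", "share", "UFF_Search", "uff_index.db")
```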
|
||||
## License
|
||||
|
||||
This project is licensed under the GNU Affero General Public License v3.0. See the [LICENSE](LICENSE) file for details.
|
||||
This license requires that if you use this software in a product or service that is accessed over a network, you must also make the source code available to the users of that product or service.
|
||||
@@ -1,38 +0,0 @@
|
||||
# -*- mode: python ; coding: utf-8 -*-


a = Analysis(
    ['uff_app.py'],
    pathex=[],
    binaries=[],
    datas=[],
    hiddenimports=['rapidfuzz', 'pypdf'],
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
    optimize=0,
)
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.datas,
    [],
    name='UFF-Search',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=False,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
|
||||
Binary file not shown.
BIN
assets/favicon.ico
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 73 KiB |
BIN
assets/uff_banner.jpeg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 63 KiB |
BIN
assets/uff_icon.jpeg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 153 KiB |
81
config.py
Normal file
@@ -0,0 +1,81 @@
|
||||
# config.py
import sys
import os

# --- PATHS ---
if os.name == 'nt':
    base_dir = os.getenv('LOCALAPPDATA')
else:
    base_dir = os.path.join(os.path.expanduser("~"), ".local", "share")

APP_DATA_DIR = os.path.join(base_dir, "UFF_Search")
if not os.path.exists(APP_DATA_DIR):
    os.makedirs(APP_DATA_DIR)

DB_NAME = os.path.join(APP_DATA_DIR, "uff_index.db")
LOG_FILE = os.path.join(APP_DATA_DIR, "uff.log")

def resource_path(relative_path):
    """
    Returns the absolute path to a resource.
    Works in dev mode AND in a PyInstaller EXE (_MEIPASS).
    """
    try:
        # PyInstaller creates a temporary folder, _MEIPASS
        base_path = sys._MEIPASS
    except Exception:
        base_path = os.path.abspath(".")

    return os.path.join(base_path, relative_path)

# --- LOGGING CLASS ---
class Logger(object):
    def __init__(self):
        self.terminal = sys.stdout
        self.log = open(LOG_FILE, "w", encoding="utf-8")

    def write(self, message):
        self.log.write(message)
        self.log.flush()

    def flush(self):
        self.log.flush()

# --- ACTIVATE THE LOGGER ---
# This now happens immediately when this file is imported!
sys.stdout = Logger()
sys.stderr = sys.stdout  # Redirect errors to the log as well

print("--- LOGGER START ---")
print(f"Logfile: {LOG_FILE}")


# --- QT MESSAGE HANDLER (Filter) ---
def qt_message_handler(mode, context, message):
    msg_lower = message.lower()
    ignore = ["qt.text.font", "qt.qpa.fonts", "opentype", "directwrite", "fontbbox", "script"]
    if any(k in msg_lower for k in ignore): return
    try:
        sys.stdout.write(f"[Qt] {message}\n")
    except Exception: pass

# --- STYLESHEET ---
STYLESHEET = """
QMainWindow { background-color: #f4f7f6; }
QFrame#Sidebar { background-color: #2c3e50; border: none; }
QLabel#SidebarTitle { color: #ecf0f1; font-weight: bold; font-size: 16px; padding: 10px; }
QListWidget { background-color: #34495e; color: #ecf0f1; border: none; font-size: 13px; }
QListWidget::item { padding: 8px; border-bottom: 1px solid #2c3e50; }
QListWidget::item:selected { background-color: #1abc9c; color: white; }
QPushButton#SidebarBtn { background-color: #34495e; color: #bdc3c7; border: 1px solid #2c3e50; padding: 8px; text-align: left; border-radius: 4px; margin: 2px 10px; }
QPushButton#SidebarBtn:hover { background-color: #1abc9c; color: white; border: 1px solid #16a085; }
QPushButton#CancelBtn { background-color: #e74c3c; color: white; font-weight: bold; border-radius: 4px; margin: 10px; padding: 8px; }
QLineEdit { padding: 10px; border: 1px solid #bdc3c7; border-radius: 20px; font-size: 14px; background-color: white; }
QLineEdit:focus { border: 2px solid #3498db; }
QPushButton#SearchBtn { background-color: #3498db; color: white; font-weight: bold; border-radius: 20px; padding: 10px 20px; font-size: 14px; }
QPushButton#SearchBtn:hover { background-color: #2980b9; }
QScrollArea { border: none; background-color: transparent; }
QWidget#ResultsContainer { background-color: transparent; }
"""
|
||||
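
`resource_path` in `config.py` relies on the PyInstaller bootloader setting `sys._MEIPASS` in a bundled app. A slightly more explicit variant, shown here only as a sketch and not as the project's code, uses `getattr` with a fallback instead of a broad `try/except`:

```python
import os
import sys


def resource_path(relative_path: str) -> str:
    """Resolve a bundled resource in both dev mode and a PyInstaller build."""
    # sys._MEIPASS is set by the PyInstaller bootloader; it is absent in dev mode,
    # where the current working directory is used instead.
    base_path = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base_path, relative_path)
```

As in the original, the dev-mode fallback depends on the current working directory, so the application should be started from the repository root for `assets/...` paths to resolve.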
177
database.py
Normal file
@@ -0,0 +1,177 @@
|
||||
# database.py
import sqlite3
import os
import numpy as np
import traceback
from sentence_transformers import util
from rapidfuzz import fuzz
from config import DB_NAME, APP_DATA_DIR

class DatabaseHandler:
    """
    Handles all database operations, including initialization,
    folder management, and searching.
    """
    def __init__(self):
        """
        Initializes the DatabaseHandler, sets up the database path,
        and initializes the database schema.
        """
        self.app_data_dir = APP_DATA_DIR
        self.db_name = DB_NAME
        self.model = None
        self.init_db()

    def init_db(self):
        """
        Initializes the database schema by creating the necessary tables
        (documents, folders, embeddings) if they don't already exist.
        """
        conn = sqlite3.connect(self.db_name)
        cursor = conn.cursor()
        cursor.execute("CREATE VIRTUAL TABLE IF NOT EXISTS documents USING fts5(filename, path, content);")
        cursor.execute("CREATE TABLE IF NOT EXISTS folders (path TEXT PRIMARY KEY, alias TEXT);")
        cursor.execute("CREATE TABLE IF NOT EXISTS embeddings (doc_id INTEGER PRIMARY KEY, vec BLOB);")
        conn.commit()
        conn.close()

    def add_folder(self, path):
        """
        Adds a new folder path to the database to be indexed.

        Args:
            path (str): The absolute path of the folder to add.

        Returns:
            bool: True if the folder was added successfully, False otherwise.
        """
        conn = sqlite3.connect(self.db_name)
        try:
            conn.execute("INSERT OR IGNORE INTO folders (path, alias) VALUES (?, ?)", (path, os.path.basename(path)))
            conn.commit()
            return True
        except Exception:
            return False
        finally:
            conn.close()

    def remove_folder(self, path):
        """
        Removes a folder and all its associated indexed files from the database.

        Args:
            path (str): The absolute path of the folder to remove.
        """
        conn = sqlite3.connect(self.db_name)
        cursor = conn.cursor()
        # Find all document IDs associated with the folder path
        cursor.execute("SELECT rowid FROM documents WHERE path LIKE ?", (f"{path}%",))
        ids = [row[0] for row in cursor.fetchall()]
        if ids:
            # Delete documents and their embeddings
            cursor.execute("DELETE FROM documents WHERE path LIKE ?", (f"{path}%",))
            placeholders = ','.join('?' * len(ids))
            cursor.execute(f"DELETE FROM embeddings WHERE doc_id IN ({placeholders})", ids)
        # Remove the folder entry
        cursor.execute("DELETE FROM folders WHERE path = ?", (path,))
        conn.commit()
        conn.close()

    def get_folders(self):
        """
        Retrieves a list of all indexed folder paths.

        Returns:
            list: A list of folder paths.
        """
        conn = sqlite3.connect(self.db_name)
        rows = conn.execute("SELECT path FROM folders").fetchall()
        conn.close()
        return [r[0] for r in rows]

    def search(self, query):
        """
        Performs a hybrid search combining semantic and lexical (keyword) search.

        Args:
            query (str): The search query.

        Returns:
            list: A list of search results, each containing
                  (filename, path, snippet).
        """
        # Safety check
        if not query.strip() or not self.model:
            return []

        try:
            # 1. Semantic Preparation
            q_vec = self.model.encode(query, convert_to_tensor=False)

            conn = sqlite3.connect(self.db_name)
            cursor = conn.cursor()

            # Load embeddings
            cursor.execute("SELECT doc_id, vec FROM embeddings")
            data = cursor.fetchall()
            doc_ids = [d[0] for d in data]

            if not doc_ids:
                conn.close()
                return []

            # Convert BLOB -> Numpy Array
            # This can fail if the DB is corrupt or dimensions mismatch
            vecs = np.array([np.frombuffer(d[1], dtype=np.float32) for d in data])

            # Calculate Cosine Similarity
            scores = util.cos_sim(q_vec, vecs)[0].numpy()
            scores = np.clip(scores, 0, 1)
            sem_map = {did: float(s) for did, s in zip(doc_ids, scores)}

            # 2. Lexical Search (FTS)
            words = query.replace('"', '').split()
            if not words: words = [query]
            fts_query = " OR ".join([f'"{w}"*' for w in words])

            try:
                fts_rows = cursor.execute("SELECT rowid, filename, content FROM documents WHERE documents MATCH ? LIMIT 100", (fts_query,)).fetchall()
            except Exception as e:
                print(f"FTS Error (ignored): {e}")
                fts_rows = []

            lex_map = {}
            for did, fname, content in fts_rows:
                r1 = fuzz.partial_ratio(query.lower(), fname.lower())
                # Truncate content for performance
                r2 = fuzz.partial_token_set_ratio(query.lower(), content[:5000].lower())
                lex_map[did] = max(r1, r2) / 100.0

            # 3. Hybrid Fusion
            final = {}
            ALPHA = 0.65  # Weight for semantic score
            BETA = 0.35   # Weight for lexical score
            for did, s_score in sem_map.items():
                if s_score < 0.15 and did not in lex_map: continue
                l_score = lex_map.get(did, 0.0)
                h_score = (s_score * ALPHA) + (l_score * BETA)
                # Small boost if both scores are good
                if s_score > 0.4 and l_score > 0.6: h_score += 0.1
                final[did] = h_score

            # 4. Fetch Results
            sorted_ids = sorted(final.keys(), key=lambda x: final[x], reverse=True)[:50]
            results = []
            for did in sorted_ids:
                row = cursor.execute("SELECT filename, path, snippet(documents, 2, '<b>', '</b>', '...', 15) FROM documents WHERE rowid = ?", (did,)).fetchone()
                if row: results.append(row)

            conn.close()
            return results

        except Exception as e:
            # NEW: This part writes the error to the log file
            print("!!! CRITICAL ERROR IN SEARCH !!!")
            print(f"Error: {e}")
            print(traceback.format_exc())
            return []
|
||||
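
The lexical step in `search` turns the user's text into an FTS5 `MATCH` expression of OR-ed prefix terms. The sketch below isolates that string construction so it can be inspected on its own; `build_fts_query` is a hypothetical helper name, and unlike the code above it returns an empty string for whitespace-only input instead of falling back to the raw query.

```python
def build_fts_query(query: str) -> str:
    """Turn free text into an FTS5 MATCH expression of OR-ed prefix terms."""
    # Double quotes would break the quoted FTS5 string syntax, so drop them.
    words = query.replace('"', '').split()
    if not words:
        return ""
    # "term"* is FTS5 prefix-query syntax: it matches any token starting with term.
    return " OR ".join(f'"{w}"*' for w in words)
```

Quoting each word keeps characters such as `-` or `:` from being interpreted as FTS5 operators, which is presumably why the original wraps every term in double quotes.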
189
indexer.py
Normal file
@@ -0,0 +1,189 @@
|
||||
# indexer.py
import os
import sqlite3
import pdfplumber
import zipfile
import io
from PyQt6.QtCore import QThread, pyqtSignal

# Optional library imports
try: import docx
except ImportError: docx = None
try: import openpyxl
except ImportError: openpyxl = None
try: from pptx import Presentation
except ImportError: Presentation = None

class IndexerThread(QThread):
    """
    A QThread that indexes files in a given folder, extracts their text content,
    and stores it in a database along with semantic embeddings.
    """
    progress_signal = pyqtSignal(str)
    finished_signal = pyqtSignal(int, int, bool)

    def __init__(self, folder, db_name, model):
        """
        Initializes the IndexerThread.

        Args:
            folder (str): The path to the folder to be indexed.
            db_name (str): The name of the SQLite database file.
            model: The sentence-transformer model for creating embeddings.
        """
        super().__init__()
        self.folder_path = folder
        self.db_name = db_name
        self.model = model
        self.is_running = True

    def stop(self):
        """Stops the indexing process."""
        self.is_running = False

    def _extract_text(self, stream, filename):
        """
        Extracts text from a file stream based on its extension.

        Args:
            stream (io.BytesIO): The file stream to read from.
            filename (str): The name of the file.

        Returns:
            str: The extracted text content.
        """
        ext = os.path.splitext(filename)[1].lower()
        text = ""
        try:
            if ext == ".pdf":
                try:
                    with pdfplumber.open(stream) as pdf:
                        for p in pdf.pages:
                            if t := p.extract_text(): text += t + "\n"
                except Exception:
                    pass

            elif ext == ".docx" and docx:
                try:
                    doc = docx.Document(stream)
                    for para in doc.paragraphs: text += para.text + "\n"
                except Exception:
                    pass

            elif ext == ".xlsx" and openpyxl:
                try:
                    wb = openpyxl.load_workbook(stream, data_only=True, read_only=True)
                    for sheet in wb.worksheets:
                        text += f"\n--- {sheet.title} ---\n"
                        for row in sheet.iter_rows(values_only=True):
                            row_text = " ".join([str(c) for c in row if c is not None])
                            if row_text.strip(): text += row_text + "\n"
                except Exception:
                    pass

            elif ext == ".pptx" and Presentation:
                try:
                    prs = Presentation(stream)
                    for i, slide in enumerate(prs.slides):
                        text += f"\n--- Slide {i+1} ---\n"
                        for shape in slide.shapes:
                            if shape.has_text_frame:
                                for p in shape.text_frame.paragraphs:
                                    for r in p.runs: text += r.text + " "
                                    text += "\n"
                except Exception:
                    pass

            elif ext in [".txt", ".md", ".py", ".json", ".csv", ".html", ".log", ".ini", ".xml"]:
                try:
                    content = stream.read()
                    if isinstance(content, str): text = content
                    else: text = content.decode('utf-8', errors='ignore')
                except Exception:
                    pass
        except Exception:
            pass
        return text

    def run(self):
        """
        Starts the indexing process.

        Iterates through files in the specified folder, extracts text,
        and saves it to the database. Emits progress and finished signals.
        """
        conn = sqlite3.connect(self.db_name)
        cursor = conn.cursor()

        # Cleanup old entries for the folder
        cursor.execute("SELECT rowid FROM documents WHERE path LIKE ?", (f"{self.folder_path}%",))
        ids = [r[0] for r in cursor.fetchall()]
        if ids:
            cursor.execute("DELETE FROM documents WHERE path LIKE ?", (f"{self.folder_path}%",))
            placeholders = ','.join('?' * len(ids))
            cursor.execute(f"DELETE FROM embeddings WHERE doc_id IN ({placeholders})", ids)
            conn.commit()

        indexed = 0
        skipped = 0
        cancelled = False

        for root, dirs, files in os.walk(self.folder_path):
            if not self.is_running:
                cancelled = True
                break
            for file in files:
                if not self.is_running:
                    cancelled = True
                    break
                path = os.path.join(root, file)
                self.progress_signal.emit(f"Checking: {file}...")

                if file.lower().endswith('.zip'):
                    try:
                        with zipfile.ZipFile(path, 'r') as z:
                            for zi in z.infolist():
                                if zi.is_dir(): continue
                                vpath = f"{path} :: {zi.filename}"
                                with z.open(zi) as zf:
                                    content = self._extract_text(io.BytesIO(zf.read()), zi.filename)
                                if content and len(content.strip()) > 20:
                                    self._save(cursor, zi.filename, vpath, content)
                                    indexed += 1
                    except Exception:
                        skipped += 1
                else:
                    try:
                        with open(path, "rb") as f:
                            file_content = io.BytesIO(f.read())
                        content = self._extract_text(file_content, file)
                        if content and len(content.strip()) > 20:
                            self._save(cursor, file, path, content)
                            indexed += 1
                        else:
                            skipped += 1
                    except Exception:
                        skipped += 1

            if cancelled:
                break

        conn.commit()
        conn.close()
        self.finished_signal.emit(indexed, skipped, cancelled)

    def _save(self, cursor, fname, path, content):
        """
        Saves the extracted content and its embedding to the database.

        Args:
            cursor: The database cursor.
            fname (str): The name of the file.
            path (str): The full path to the file.
            content (str): The extracted text content.
        """
        cursor.execute("INSERT INTO documents (filename, path, content) VALUES (?, ?, ?)", (fname, path, content))
        did = cursor.lastrowid
        # Truncate content for embedding to avoid excessive memory usage
        vec = self.model.encode(content[:8000], convert_to_tensor=False).tobytes()
        cursor.execute("INSERT INTO embeddings (doc_id, vec) VALUES (?, ?)", (did, vec))
|
||||
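
One thing to watch in the cleanup step of `run` (and in `remove_folder` in `database.py`): the pattern `path LIKE '<folder>%'` is a plain prefix match, so cleaning `C:\docs` would also touch rows under a sibling folder such as `C:\docs2`, and `%` or `_` in a folder name act as wildcards. A possible tightening is sketched below against a plain in-memory table. `like_prefix` is a hypothetical helper, and the sketch assumes stored paths use the given separator.

```python
import sqlite3


def like_prefix(folder: str, sep: str = "\\") -> str:
    """Build a LIKE pattern that matches only paths inside `folder`."""
    # Escape LIKE wildcards, using '!' as the escape character.
    escaped = folder.replace("!", "!!").replace("%", "!%").replace("_", "!_")
    return escaped.rstrip(sep) + sep + "%"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (path TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?)",
    [(r"C:\docs\a.txt",), (r"C:\docs2\b.txt",), (r"C:\docs\z.zip :: c.txt",)],
)

# Plain prefix match, as in the code above: also catches C:\docs2.
naive = conn.execute(
    "SELECT path FROM documents WHERE path LIKE ?", (r"C:\docs" + "%",)
).fetchall()

# Prefix with trailing separator and escaped wildcards: only C:\docs itself.
strict = conn.execute(
    "SELECT path FROM documents WHERE path LIKE ? ESCAPE '!'",
    (like_prefix(r"C:\docs"),),
).fetchall()
conn.close()
```

Entries from ZIP archives keep matching, since their virtual path still starts with the folder path followed by the separator.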
77
main.py
Normal file
@@ -0,0 +1,77 @@
|
||||
# main.py
import sys
import os
import time
from PyQt6.QtWidgets import QApplication
from PyQt6.QtGui import QPixmap, QFont, QIcon
from PyQt6.QtCore import qInstallMessageHandler, QTimer, Qt

# Config first!
from config import qt_message_handler, LOG_FILE, resource_path

from ui import UffWindow, ModernSplashScreen, ModelLoaderThread

qInstallMessageHandler(qt_message_handler)
os.environ["QT_LOGGING_RULES"] = "qt.text.font.db=false;qt.qpa.fonts=false"

if __name__ == "__main__":
    try:
        app = QApplication(sys.argv)
        app.setFont(QFont("Segoe UI", 10))

        icon_path = resource_path("assets/uff_icon.jpeg")  # <--- HERE
        if os.path.exists(icon_path):
            app_icon = QIcon(icon_path)
            app.setWindowIcon(app_icon)

        # LOAD SPLASH (using resource_path)
        banner_path = resource_path("assets/uff_banner.jpeg")
        splash_pix = QPixmap(banner_path)
        if splash_pix.isNull():
            splash_pix = QPixmap(600, 400)
            splash_pix.fill(Qt.GlobalColor.white)

        splash = ModernSplashScreen(splash_pix)
        splash.show()


        splash.set_progress(10, "Lade Konfiguration...")
        app.processEvents()
        time.sleep(0.3)  # Just for effect

        splash.set_progress(30, "Verbinde Datenbank...")
        app.processEvents()

        # Create the main window (but keep it hidden for now)
        window = UffWindow()

        splash.set_progress(50, "Lade Benutzeroberfläche...")
        app.processEvents()
        time.sleep(0.2)

        # 4. LOAD THE HEAVY AI MODEL
        splash.set_progress(60, "Lade KI-Modell (das dauert kurz)...")
        app.processEvents()

        # We start the thread, but we have to wait until it has finished
        # before we close the splash.
        loader = ModelLoaderThread()

        def on_loaded(model):
            splash.set_progress(100, "Fertig!")
            app.processEvents()
            time.sleep(0.5)  # Pause briefly at 100%

            window.on_model_loaded(model)  # Pass the model to the window
            window.show()                  # Show the window
            splash.finish(window)          # Close the splash

        loader.model_loaded.connect(on_loaded)
        loader.start()

        sys.exit(app.exec())

    except Exception as e:
        import traceback
        print("CRITICAL MAIN CRASH:")
        print(traceback.format_exc())
|
||||
72
main.spec
Normal file
@@ -0,0 +1,72 @@
|
||||
# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import collect_all

# --- 1. COLLECT SPECIAL LIBRARIES ---
# sentence_transformers and rapidfuzz are complex, so we collect everything automatically
datas = [('assets', 'assets')]
binaries = []
hiddenimports = [
    'docx',
    'openpyxl',
    'pptx',
    'pdfplumber',
    'rapidfuzz',
    'sentence_transformers',
    'numpy'
]

# Collect all data for the AI library (prevents import errors)
tmp_ret = collect_all('sentence_transformers')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]

# Collect rapidfuzz completely as well, to be safe
tmp_ret = collect_all('rapidfuzz')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]


# --- 2. ANALYSIS ---
a = Analysis(
    ['main.py'],
    pathex=[],
    binaries=binaries,
    datas=datas,
    hiddenimports=hiddenimports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
    optimize=0,
)
pyz = PYZ(a.pure)

# --- 3. BUILD THE EXE ---
exe = EXE(
    pyz,
    a.scripts,
    [],
    exclude_binaries=True,
    name='UFF_Search',  # Name of the file (UFF_Search.exe)
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    console=False,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
    icon='assets\\favicon.ico',  # Path to the icon
)

# --- 4. ASSEMBLE THE FOLDER ---
coll = COLLECT(
    exe,
    a.binaries,
    a.datas,
    strip=False,
    upx=True,
    upx_exclude=[],
    name='UFF_Search',  # Name of the folder
)
|
||||
@@ -1,3 +1,11 @@
|
||||
pypdf
pdfplumber
pdfminer.six
rapidfuzz
PyQt6
sentence-transformers==2.2.2
transformers==4.28.1
torch==1.13.1
numpy==1.24.2
python-docx
openpyxl
python-pptx
|
||||
@@ -1,52 +0,0 @@
|
||||
; -- UFF-Search installer script --

[Setup]
; The name shown everywhere
AppName=UFF Text Search
AppVersion=3.0
AppPublisher=Konstantin Roßmann
AppPublisherURL=https://rossmann-it-solutions.de

; Default installation directory; {autopf} is "Program Files"
DefaultDirName={autopf}\UFF-Search
; Name of the Start menu group
DefaultGroupName=UFF Search

; Output location of the finished setup.exe (e.g. on the desktop or in the project folder)
OutputDir=.
OutputBaseFilename=UFF_Search_Installer_v3
Compression=lzma
SolidCompression=yes

; Icon for the installer itself (optional, otherwise omit)
; SetupIconFile=app.ico

; Request administrator rights for installation into Program Files
PrivilegesRequired=admin

[Languages]
Name: "german"; MessagesFile: "compiler:Languages\German.isl"

[Tasks]
; Checkbox: "Create desktop shortcut"
Name: "desktopicon"; Description: "{cm:CreateDesktopIcon}"; GroupDescription: "{cm:AdditionalIcons}"; Flags: unchecked

[Files]
; !!! IMPORTANT: ADJUST THE PATH TO YOUR EXE HERE !!!
; "Source" must point to the file that PyInstaller created in the "dist" folder.
Source: "C:\Users\konst\Arbeit\unsorted-folder-full-text-search\dist\UFF-Search.exe"; DestDir: "{app}"; Flags: ignoreversion

; If you want to ship an icon as well (optional)
; Source: "C:\Pfad\Zu\Deinem\Projekt\app.ico"; DestDir: "{app}"; Flags: ignoreversion

[Icons]
; Start menu shortcut
Name: "{group}\UFF Text Search"; Filename: "{app}\UFF-Search.exe"
; Uninstall shortcut
Name: "{group}\Uninstall UFF Search"; Filename: "{uninstallexe}"
; Desktop shortcut (if selected by the user)
Name: "{commondesktop}\UFF Text Search"; Filename: "{app}\UFF-Search.exe"; Tasks: desktopicon

[Run]
; Checkbox at the end: "Launch program now"
Description: "{cm:LaunchProgram,UFF Text Search}"; Filename: "{app}\UFF-Search.exe"; Flags: nowait postinstall skipifsilent
|
||||
414
uff_app.py
@@ -1,414 +0,0 @@
|
||||
import sys
|
||||
import os
|
||||
import sqlite3
|
||||
from pypdf import PdfReader
|
||||
|
||||
# NEU: Für die Fuzzy-Logik
|
||||
from rapidfuzz import process, fuzz
|
||||
|
||||
from PyQt6.QtWidgets import (QApplication, QMainWindow, QWidget, QVBoxLayout,
|
||||
QHBoxLayout, QLineEdit, QPushButton, QLabel,
|
||||
QFileDialog, QTextBrowser, QProgressBar, QMessageBox,
|
||||
QListWidget, QListWidgetItem, QSplitter, QFrame)
|
||||
from PyQt6.QtCore import Qt, QThread, pyqtSignal, QUrl
|
||||
from PyQt6.QtGui import QDesktopServices
|
||||
|
||||
# --- 1. DATABASE MANAGER (with fuzzy ranking) ---

class DatabaseHandler:
    def __init__(self):
        # 1. Determine the correct AppData folder for the user
        # Windows: C:\Users\Name\AppData\Local\UFF_Search
        if os.name == 'nt':
            base_dir = os.getenv('LOCALAPPDATA')
        else:
            # Mac/Linux: ~/.local/share/UFF_Search
            base_dir = os.path.join(os.path.expanduser("~"), ".local", "share")

        # 2. Create our own subfolder
        self.app_data_dir = os.path.join(base_dir, "UFF_Search")

        # Create the folder if it does not exist yet
        os.makedirs(self.app_data_dir, exist_ok=True)

        # 3. Path to the database
        self.db_name = os.path.join(self.app_data_dir, "uff_index.db")

        # Debug info (in case you test it in the terminal)
        print(f"Datenbank Pfad: {self.db_name}")

        self.init_db()

    def init_db(self):
        conn = sqlite3.connect(self.db_name)
        cursor = conn.cursor()
        cursor.execute("""
            CREATE VIRTUAL TABLE IF NOT EXISTS documents
            USING fts5(filename, path, content);
        """)
        cursor.execute("""
            CREATE TABLE IF NOT EXISTS folders (
                path TEXT PRIMARY KEY,
                alias TEXT
            );
        """)
        conn.commit()
        conn.close()

    def add_folder(self, path):
        conn = sqlite3.connect(self.db_name)
        try:
            cur = conn.execute("INSERT OR IGNORE INTO folders (path, alias) VALUES (?, ?)", (path, os.path.basename(path)))
            conn.commit()
            # INSERT OR IGNORE does not raise on duplicates, so report
            # success only if a row was actually inserted.
            return cur.rowcount > 0
        except sqlite3.Error:
            return False
        finally:
            conn.close()

    def remove_folder(self, path):
        conn = sqlite3.connect(self.db_name)
        conn.execute("DELETE FROM folders WHERE path = ?", (path,))
        conn.execute("DELETE FROM documents WHERE path LIKE ?", (f"{path}%",))
        conn.commit()
        conn.close()

    def get_folders(self):
        conn = sqlite3.connect(self.db_name)
        rows = conn.execute("SELECT path FROM folders").fetchall()
        conn.close()
        return [r[0] for r in rows]

    def search(self, query):
        if not query.strip():
            return []

        conn = sqlite3.connect(self.db_name)

        # Attempt 1: strict database search (fast)
        words = query.replace('"', '').split()
        # We search for "word"* -> matches word prefixes
        sql_query_parts = [f'"{w}"*' for w in words]
        sql_query_string = " OR ".join(sql_query_parts)

        sql = """
            SELECT filename, path, snippet(documents, 2, '<b>', '</b>', '...', 15), content
            FROM documents
            WHERE documents MATCH ?
            LIMIT 200
        """

        try:
            rows = conn.execute(sql, (sql_query_string,)).fetchall()
        except sqlite3.Error:
            rows = []

        # Attempt 2 (FALLBACK): if the DB returns fewer than 5 rows, load documents unfiltered.
        # This is the "panic mode" for heavy typos (like "vertraaag")
        if len(rows) < 5:
            # Simply fetch the first 1000 documents without a filter
            fallback_sql = """
                SELECT filename, path, snippet(documents, 2, '<b>', '</b>', '...', 15), content
                FROM documents
                LIMIT 1000
            """
            rows = conn.execute(fallback_sql).fetchall()

        conn.close()

        # 3. Python fuzzy re-ranking (RapidFuzz)
        scored_results = []

        for filename, path, snippet, content in rows:
            # Compute the scores
            score_name = fuzz.partial_ratio(query.lower(), filename.lower())

            # Content check: use the content itself (in case the snippet is too short),
            # limited to the first 5000 characters for performance
            check_content = content[:5000] if content else ""
            score_content = fuzz.partial_token_set_ratio(query.lower(), check_content.lower())

            final_score = max(score_name, score_content)

            # Bonus when every query word occurs literally
            if all(w.lower() in (filename + check_content).lower() for w in words):
                final_score += 10

            # Filter: only show results with a reasonably good score
            # For "vertraaag" vs "vertrag" the score is usually > 70
            if final_score > 55:
                scored_results.append({
                    "score": final_score,
                    "data": (filename, path, snippet)
                })

        # 4. Sort by score, best first
        scored_results.sort(key=lambda x: x["score"], reverse=True)

        return [item["data"] for item in scored_results[:50]]

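Review note: the strict step in `DatabaseHandler.search` turns the user's words into an FTS5 prefix expression before the fuzzy re-ranking runs. A minimal sketch of just that string-building step follows. The helper name `build_fts_prefix_query` is hypothetical and does not appear in the original file.

```python
def build_fts_prefix_query(query: str) -> str:
    """Build an FTS5 MATCH expression with one quoted prefix term per word.

    Hypothetical helper (not in the original code). Double quotes are
    stripped first so user input cannot terminate the quoted FTS5 strings.
    """
    words = query.replace('"', '').split()
    # "word"* asks FTS5 for tokens that start with `word`
    return " OR ".join(f'"{w}"*' for w in words)


if __name__ == "__main__":
    print(build_fts_prefix_query("vertrag 2023"))
```

Because the terms are joined with `OR`, a document matching any single word qualifies; ranking is left entirely to the RapidFuzz pass.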
# --- 2. INDEXER (unchanged) ---

class IndexerThread(QThread):
    progress_signal = pyqtSignal(str)
    finished_signal = pyqtSignal(int, int, bool)

    def __init__(self, folder_path, db_name="uff_index.db"):
        super().__init__()
        self.folder_path = folder_path
        self.db_name = db_name
        self.is_running = True

    def stop(self):
        self.is_running = False

    def _extract_text(self, filepath):
        ext = os.path.splitext(filepath)[1].lower()
        try:
            if ext == ".pdf":
                reader = PdfReader(filepath)
                text = ""
                for page in reader.pages:
                    if page_text := page.extract_text():
                        text += page_text + "\n"
                return text
            elif ext in [".txt", ".md", ".py", ".json", ".csv", ".html", ".log", ".ini", ".xml"]:
                with open(filepath, "r", encoding="utf-8", errors="ignore") as f:
                    return f.read()
            return None
        except Exception:
            # Unreadable or corrupt files are skipped
            return None

    def run(self):
        conn = sqlite3.connect(self.db_name)
        conn.execute("DELETE FROM documents WHERE path LIKE ?", (f"{self.folder_path}%",))
        conn.commit()

        indexed = 0
        skipped = 0
        was_cancelled = False

        for root, dirs, files in os.walk(self.folder_path):
            if not self.is_running:
                was_cancelled = True
                break
            for file in files:
                if not self.is_running:
                    was_cancelled = True
                    break

                self.progress_signal.emit(f"Lese: {file}...")
                path = os.path.join(root, file)
                content = self._extract_text(path)

                if content and len(content.strip()) > 0:
                    conn.execute(
                        "INSERT INTO documents (filename, path, content) VALUES (?, ?, ?)",
                        (file, path, content)
                    )
                    indexed += 1
                else:
                    skipped += 1
            if was_cancelled:
                break

        conn.commit()
        conn.close()
        self.finished_signal.emit(indexed, skipped, was_cancelled)

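Review note: both `remove_folder` and `IndexerThread.run` delete rows with `path LIKE '<folder>%'`. As written, that pattern also matches sibling folders that share the prefix (for example `docs` and `docs_old`), and any `%` or `_` in the folder name acts as a wildcard. One possible tightening is sketched below against a plain in-memory table rather than the FTS5 table. The helpers `like_prefix_pattern` and `delete_folder_rows`, and the sample paths used to exercise them, are assumptions made for illustration.

```python
import sqlite3


def like_prefix_pattern(folder: str, sep: str = "/") -> str:
    """Return a LIKE pattern that matches only paths inside `folder`.

    Hypothetical helper: appends the path separator so sibling folders do
    not match, then escapes the LIKE wildcards with a backslash.
    """
    base = folder.rstrip(sep) + sep
    for ch in ("\\", "%", "_"):  # escape the escape character first
        base = base.replace(ch, "\\" + ch)
    return base + "%"


def delete_folder_rows(conn: sqlite3.Connection, folder: str, sep: str = "/") -> None:
    """Delete all rows whose path lies below `folder`."""
    conn.execute(
        "DELETE FROM documents WHERE path LIKE ? ESCAPE '\\'",
        (like_prefix_pattern(folder, sep),),
    )
```

On Windows the separator would be passed as `sep="\\"`. SQLite's `LIKE` stays case-insensitive for ASCII characters, which this sketch does not change.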
# --- 3. UI (unchanged) ---

class UffWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.db = DatabaseHandler()
        self.indexer_thread = None
        self.initUI()
        self.load_saved_folders()

    def initUI(self):
        self.setWindowTitle("UFF Text Search v3.0 (Fuzzy)")
        self.resize(1000, 700)

        central = QWidget()
        self.setCentralWidget(central)
        main_layout = QHBoxLayout(central)

        # LEFT
        left_panel = QFrame()
        left_panel.setFixedWidth(250)
        left_layout = QVBoxLayout(left_panel)
        left_layout.setContentsMargins(0, 0, 0, 0)

        lbl_folders = QLabel("📂 Meine Ordner")
        lbl_folders.setStyleSheet("font-weight: bold; font-size: 14px;")

        self.folder_list = QListWidget()
        self.folder_list.setSelectionMode(QListWidget.SelectionMode.SingleSelection)

        btn_add = QPushButton(" + Hinzufügen")
        btn_add.clicked.connect(self.add_new_folder)

        btn_remove = QPushButton(" - Entfernen")
        btn_remove.clicked.connect(self.delete_selected_folder)

        self.btn_rescan = QPushButton(" ↻ Neu scannen")
        self.btn_rescan.clicked.connect(self.rescan_selected_folder)

        self.btn_cancel = QPushButton("🛑 Abbrechen")
        self.btn_cancel.setStyleSheet("background-color: #ffcccc; color: #cc0000; font-weight: bold;")
        self.btn_cancel.clicked.connect(self.cancel_indexing)
        self.btn_cancel.hide()

        left_layout.addWidget(lbl_folders)
        left_layout.addWidget(self.folder_list)
        left_layout.addWidget(btn_add)
        left_layout.addWidget(btn_remove)
        left_layout.addStretch()
        left_layout.addWidget(self.btn_rescan)
        left_layout.addWidget(self.btn_cancel)

        # RIGHT
        right_panel = QWidget()
        right_layout = QVBoxLayout(right_panel)

        search_container = QHBoxLayout()
        self.input_search = QLineEdit()
        self.input_search.setPlaceholderText("Suchbegriff... (Fuzzy aktiv)")
        self.input_search.returnPressed.connect(self.perform_search)
        self.input_search.setStyleSheet("padding: 8px; font-size: 14px;")

        btn_go = QPushButton("Suchen")
        btn_go.setFixedWidth(100)
        btn_go.clicked.connect(self.perform_search)

        search_container.addWidget(self.input_search)
        search_container.addWidget(btn_go)

        self.lbl_status = QLabel("Bereit.")
        self.lbl_status.setStyleSheet("color: #666;")
        self.progress_bar = QProgressBar()
        self.progress_bar.hide()

        self.result_browser = QTextBrowser()
        self.result_browser.setOpenExternalLinks(False)
        self.result_browser.anchorClicked.connect(self.link_clicked)
        self.result_browser.setStyleSheet("background-color: white; border: 1px solid #ccc;")

        right_layout.addLayout(search_container)
        right_layout.addWidget(self.lbl_status)
        right_layout.addWidget(self.progress_bar)
        right_layout.addWidget(self.result_browser)

        splitter = QSplitter(Qt.Orientation.Horizontal)
        splitter.addWidget(left_panel)
        splitter.addWidget(right_panel)
        splitter.setSizes([250, 750])

        main_layout.addWidget(splitter)

    # LOGIC
    def load_saved_folders(self):
        self.folder_list.clear()
        folders = self.db.get_folders()
        for f in folders:
            item = QListWidgetItem(f)
            item.setToolTip(f)
            self.folder_list.addItem(item)

    def add_new_folder(self):
        folder = QFileDialog.getExistingDirectory(self, "Ordner wählen")
        if folder:
            if self.db.add_folder(folder):
                self.load_saved_folders()
                self.start_indexing(folder)
            else:
                QMessageBox.warning(self, "Info", "Ordner ist bereits vorhanden.")

    def delete_selected_folder(self):
        item = self.folder_list.currentItem()
        if not item:
            return
        path = item.text()
        if QMessageBox.question(self, "Löschen", f"Ordner entfernen?\n{path}",
                                QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No) == QMessageBox.StandardButton.Yes:
            self.db.remove_folder(path)
            self.load_saved_folders()
            self.result_browser.clear()
            self.lbl_status.setText("Ordner entfernt.")

    def rescan_selected_folder(self):
        item = self.folder_list.currentItem()
        if not item:
            QMessageBox.information(self, "Info", "Bitte Ordner links auswählen.")
            return
        self.start_indexing(item.text())

    def start_indexing(self, folder):
        self.set_ui_busy(True)
        self.lbl_status.setText(f"Starte... {os.path.basename(folder)}")

        # THIS WAS THE BUG:
        # We have to tell the thread explicitly where the database lives!
        # self.db.db_name holds the correct path (C:\Users\...\AppData\...)
        self.indexer_thread = IndexerThread(folder, db_name=self.db.db_name)

        self.indexer_thread.progress_signal.connect(lambda msg: self.lbl_status.setText(msg))
        self.indexer_thread.finished_signal.connect(self.indexing_finished)
        self.indexer_thread.start()

    def cancel_indexing(self):
        if self.indexer_thread and self.indexer_thread.isRunning():
            self.lbl_status.setText("Breche ab...")
            self.indexer_thread.stop()

    def indexing_finished(self, indexed, skipped, was_cancelled):
        self.set_ui_busy(False)
        if was_cancelled:
            self.lbl_status.setText(f"Abgebrochen. ({indexed} indiziert).")
            QMessageBox.information(self, "Abbruch", f"Vorgang abgebrochen.\nBis dahin indiziert: {indexed}")
        else:
            self.lbl_status.setText(f"Fertig. {indexed} neu, {skipped} übersprungen.")
            QMessageBox.information(self, "Fertig", f"Scan abgeschlossen!\n{indexed} Dateien im Index.")

    def set_ui_busy(self, busy):
        self.input_search.setEnabled(not busy)
        self.folder_list.setEnabled(not busy)
        self.btn_rescan.setVisible(not busy)
        self.btn_cancel.setVisible(busy)
        if busy:
            self.progress_bar.setRange(0, 0)
            self.progress_bar.show()
        else:
            self.progress_bar.hide()

    def perform_search(self):
        query = self.input_search.text()
        if not query:
            return

        # Run the search (now with fuzzy matching!)
        results = self.db.search(query)
        self.lbl_status.setText(f"{len(results)} relevante Treffer.")

        html = ""
        if not results:
            html = "<h3 style='color: gray; text-align: center; margin-top: 20px;'>Nichts gefunden.</h3>"

        for filename, filepath, snippet in results:
            file_url = QUrl.fromLocalFile(filepath).toString()
            html += f"""
            <div style='margin-bottom: 10px; padding: 10px; background-color: #f9f9f9; border-left: 4px solid #2980b9;'>
                <a href="{file_url}" style='font-size: 16px; font-weight: bold; color: #2980b9; text-decoration: none;'>
                    {filename}
                </a>
                <div style='color: #333; margin-top: 5px; font-family: sans-serif; font-size: 13px;'>{snippet}</div>
                <div style='color: #999; font-size: 11px; margin-top: 4px;'>{filepath}</div>
            </div>
            """
        self.result_browser.setHtml(html)

    def link_clicked(self, url):
        QDesktopServices.openUrl(url)


if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = UffWindow()
    window.show()
    sys.exit(app.exec())
316
ui.py
Normal file
@@ -0,0 +1,316 @@
# ui.py
import os
from PyQt6.QtWidgets import (QApplication, QMainWindow, QWidget, QVBoxLayout, QHBoxLayout,
                             QLineEdit, QPushButton, QLabel, QFileDialog,
                             QProgressBar, QMessageBox, QListWidget, QListWidgetItem,
                             QSplitter, QFrame, QScrollArea, QStyle, QGraphicsDropShadowEffect,
                             QSplashScreen)  # QSplashScreen is important here
from PyQt6.QtCore import Qt, QUrl, QThread, pyqtSignal, QRect
from PyQt6.QtGui import QDesktopServices, QColor, QFont, QPainter, QIcon, QPixmap  # QPainter & QIcon are new
from sentence_transformers import SentenceTransformer

from database import DatabaseHandler
from indexer import IndexerThread
from config import STYLESHEET

# --- NEW: a modern splash screen with a progress bar ---
class ModernSplashScreen(QSplashScreen):
    def __init__(self, pixmap):
        super().__init__(pixmap)
        self.progress = 0
        self.message = "Initialisiere..."
        # Font for the loading text (named text_font so it does not shadow QWidget.font())
        self.text_font = QFont("Segoe UI", 10, QFont.Weight.Bold)

    def set_progress(self, value, text):
        self.progress = value
        self.message = text
        self.repaint()  # Force a repaint

    def drawContents(self, painter):
        # 1. Draw the regular image
        super().drawContents(painter)

        # 2. Progress bar background (bottom)
        # We paint directly onto the image
        bg_rect = self.rect()
        bar_height = 20
        # Position: at the very bottom of the image
        bar_rect = QRect(0, bg_rect.height() - bar_height, bg_rect.width(), bar_height)

        # Bar background (dark gray)
        painter.setPen(Qt.PenStyle.NoPen)
        painter.setBrush(QColor(50, 50, 50))
        painter.drawRect(bar_rect)

        # 3. The progress (turquoise/blue)
        # Compute the width from the percentage
        progress_width = int(bg_rect.width() * (self.progress / 100))
        prog_rect = QRect(0, bg_rect.height() - bar_height, progress_width, bar_height)

        painter.setBrush(QColor("#3498db"))  # UFF blue
        painter.drawRect(prog_rect)

        # 4. Draw the text (centered above the bar or inside it)
        painter.setPen(QColor("white"))
        painter.setFont(self.text_font)
        # Draw the text slightly above the bar
        text_rect = QRect(0, bg_rect.height() - bar_height - 30, bg_rect.width(), 25)
        painter.drawText(text_rect, Qt.AlignmentFlag.AlignCenter, self.message)

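Review note: `drawContents` scales the bar by `self.progress / 100` without bounding the value, so a caller passing more than 100 would draw past the image edge. The same arithmetic with clamping could look like the sketch below. The function name `progress_bar_width` is hypothetical and not part of `ui.py`.

```python
def progress_bar_width(total_width: int, progress: float) -> int:
    """Return the filled width in pixels for a percentage in [0, 100].

    Hypothetical helper: values outside the range are clamped instead of
    producing a negative or oversized rectangle.
    """
    clamped = max(0.0, min(100.0, progress))
    return int(total_width * clamped / 100)
```

Inside the splash screen this would replace the inline expression, for example `progress_bar_width(bg_rect.width(), self.progress)`.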
# --- Thread for loading the model ---
class ModelLoaderThread(QThread):
    model_loaded = pyqtSignal(object)

    def run(self):
        try:
            # This is the heavy part that takes a while
            model = SentenceTransformer('all-MiniLM-L6-v2')
            self.model_loaded.emit(model)
        except Exception:
            # Signal failure with None; the window shows the error dialog
            self.model_loaded.emit(None)

# --- SearchResultItem (unchanged, included here for completeness) ---
class SearchResultItem(QFrame):
    def __init__(self, filename, filepath, snippet, parent=None):
        super().__init__(parent)
        self.filepath = filepath
        self.setToolTip(filepath)

        self.setFrameShape(QFrame.Shape.StyledPanel)
        self.setStyleSheet("""
            SearchResultItem { background-color: white; border: 1px solid #e0e0e0; border-radius: 8px; }
            SearchResultItem:hover { border: 1px solid #3498db; background-color: #fbfbfb; }
        """)

        shadow = QGraphicsDropShadowEffect(self)
        shadow.setBlurRadius(10)
        shadow.setXOffset(0)
        shadow.setYOffset(2)
        shadow.setColor(QColor(0, 0, 0, 30))
        self.setGraphicsEffect(shadow)

        layout = QVBoxLayout(self)
        layout.setContentsMargins(15, 15, 15, 15)
        layout.setSpacing(5)

        self.btn_title = QPushButton(filename)
        self.btn_title.setCursor(Qt.CursorShape.PointingHandCursor)
        self.btn_title.setMouseTracking(True)
        self.btn_title.setStyleSheet("""
            QPushButton { text-align: left; font-weight: bold; font-size: 16px; color: #2c3e50; border: none; background: transparent; padding: 0px; }
            QPushButton:hover { color: #3498db; text-decoration: underline; }
        """)
        self.btn_title.clicked.connect(self.open_file)

        self.lbl_snippet = QLabel(snippet)
        self.lbl_snippet.setWordWrap(True)
        self.lbl_snippet.setStyleSheet("color: #555; font-size: 13px; line-height: 1.4;")

        path_layout = QHBoxLayout()
        lbl_icon = QLabel("📄")
        lbl_icon.setStyleSheet("font-size: 10px; color: #95a5a6;")

        self.lbl_path = QLabel(filepath)
        self.lbl_path.setStyleSheet("color: #95a5a6; font-size: 11px;")

        path_layout.addWidget(lbl_icon)
        path_layout.addWidget(self.lbl_path)
        path_layout.addStretch()

        layout.addWidget(self.btn_title)
        layout.addWidget(self.lbl_snippet)
        layout.addLayout(path_layout)

    def open_file(self):
        target = self.filepath.split(" :: ")[0] if " :: " in self.filepath else self.filepath
        QDesktopServices.openUrl(QUrl.fromLocalFile(target))

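Review note: `open_file` opens only the part of `filepath` that precedes `" :: "`. Assuming that separator marks a member inside a container file (the README mentions searching inside ZIP archives, but the indexer that would write such paths is not shown in this part of the diff), the split can be sketched with `str.partition`. The name `split_archive_path` is hypothetical.

```python
from typing import Optional, Tuple


def split_archive_path(filepath: str, sep: str = " :: ") -> Tuple[str, Optional[str]]:
    """Split 'container :: member' into (container, member).

    Hypothetical helper: returns (filepath, None) when the separator is
    absent, which is the plain-file case.
    """
    container, found, member = filepath.partition(sep)
    return container, (member if found else None)
```

The first element is what `open_file` passes to `QUrl.fromLocalFile`; the second could be shown in the result card so users can see which member inside the container matched.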
# --- The main window ---
class UffWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.db = DatabaseHandler()
        # Initialised here so cancel_idx() is safe before the first scan
        self.idx_thread = None
        self.initUI()

        self.load_saved_folders()

    def initUI(self):
        self.setWindowTitle("UFF Search v1.0")
        self.resize(1100, 750)
        self.setStyleSheet(STYLESHEET)

        central = QWidget()
        self.setCentralWidget(central)
        main_layout = QHBoxLayout(central)
        main_layout.setContentsMargins(0, 0, 0, 0)
        main_layout.setSpacing(0)

        # -- SIDEBAR --
        left_panel = QFrame()
        left_panel.setObjectName("Sidebar")
        left_panel.setFixedWidth(260)
        left = QVBoxLayout(left_panel)
        left.setContentsMargins(0, 20, 0, 20)

        lbl_title = QLabel(" UFF SEARCH")
        lbl_title.setObjectName("SidebarTitle")

        self.folder_list = QListWidget()

        btn_add = QPushButton(" Ordner hinzufügen")
        btn_add.setObjectName("SidebarBtn")
        btn_add.setIcon(self.style().standardIcon(QStyle.StandardPixmap.SP_FileDialogNewFolder))
        btn_add.clicked.connect(self.add_new_folder)

        btn_del = QPushButton(" Ordner entfernen")
        btn_del.setObjectName("SidebarBtn")
        btn_del.setIcon(self.style().standardIcon(QStyle.StandardPixmap.SP_TrashIcon))
        btn_del.clicked.connect(self.delete_selected_folder)

        self.btn_rescan = QPushButton(" Neu scannen")
        self.btn_rescan.setObjectName("SidebarBtn")
        self.btn_rescan.setIcon(self.style().standardIcon(QStyle.StandardPixmap.SP_BrowserReload))
        self.btn_rescan.clicked.connect(self.rescan)

        self.btn_cancel = QPushButton("STOPPEN")
        self.btn_cancel.setObjectName("CancelBtn")
        self.btn_cancel.setIcon(self.style().standardIcon(QStyle.StandardPixmap.SP_DialogCancelButton))
        self.btn_cancel.clicked.connect(self.cancel_idx)
        self.btn_cancel.hide()

        left.addWidget(lbl_title)
        left.addSpacing(10)
        left.addWidget(self.folder_list)
        left.addSpacing(10)
        left.addWidget(btn_add)
        left.addWidget(btn_del)
        left.addWidget(self.btn_rescan)
        left.addWidget(self.btn_cancel)

        # -- MAIN AREA --
        right_panel = QWidget()
        right_panel.setObjectName("MainArea")
        right = QVBoxLayout(right_panel)
        right.setContentsMargins(30, 30, 30, 30)
        right.setSpacing(15)

        search_box = QHBoxLayout()
        self.input = QLineEdit()
        self.input.setPlaceholderText("Wonach suchst du heute?")
        self.input.returnPressed.connect(self.search)

        self.btn_go = QPushButton("Suchen")
        self.btn_go.setObjectName("SearchBtn")
        self.btn_go.setCursor(Qt.CursorShape.PointingHandCursor)
        self.btn_go.clicked.connect(self.search)

        search_box.addWidget(self.input)
        search_box.addWidget(self.btn_go)

        status_box = QHBoxLayout()
        self.lbl_status = QLabel("Bereit.")
        self.lbl_status.setObjectName("StatusLabel")
        self.prog = QProgressBar()
        self.prog.hide()
        status_box.addWidget(self.lbl_status)
        status_box.addWidget(self.prog)

        self.scroll = QScrollArea()
        self.scroll.setWidgetResizable(True)
        self.res_cont = QWidget()
        self.res_cont.setObjectName("ResultsContainer")
        self.res_layout = QVBoxLayout(self.res_cont)
        self.res_layout.setAlignment(Qt.AlignmentFlag.AlignTop)
        self.res_layout.setSpacing(15)
        self.scroll.setWidget(self.res_cont)

        right.addLayout(search_box)
        right.addLayout(status_box)
        right.addWidget(self.scroll)

        main_layout.addWidget(left_panel)
        main_layout.addWidget(right_panel)
        self.set_ui_enabled(False)

    def set_ui_enabled(self, enabled):
        self.input.setEnabled(enabled)
        self.btn_go.setEnabled(enabled)
        self.folder_list.setEnabled(enabled)

    # Model-loading methods (now driven from main)
    def on_model_loaded(self, model):
        if not model:
            QMessageBox.critical(self, "Fehler", "Modell konnte nicht geladen werden.")
            return
        self.db.model = model
        self.lbl_status.setText("Bereit für deine Suche.")
        self.set_ui_enabled(True)

    # ... REMAINING METHODS (search, add_folder etc.) stay the same as before ...
    # (Simply copy the methods from your old ui.py in here:
    # search, load_saved_folders, add_new_folder, delete_selected_folder, rescan, start_idx, cancel_idx, idx_done)

    def search(self):
        query = self.input.text()
        if not query:
            return
        self.lbl_status.setText("Suche läuft...")
        QApplication.processEvents()

        # Remove the result widgets of the previous search
        while self.res_layout.count():
            child = self.res_layout.takeAt(0)
            if child.widget():
                child.widget().deleteLater()

        results = self.db.search(query)
        self.lbl_status.setText(f"{len(results)} Treffer gefunden.")

        if not results:
            lbl = QLabel("Leider keine Ergebnisse.")
            lbl.setStyleSheet("color: #95a5a6; font-size: 18px; margin-top: 40px;")
            lbl.setAlignment(Qt.AlignmentFlag.AlignHCenter)
            self.res_layout.addWidget(lbl)
        else:
            for fname, fpath, snippet in results:
                self.res_layout.addWidget(SearchResultItem(fname, fpath, snippet))
            self.res_layout.addStretch()

    def load_saved_folders(self):
        self.folder_list.clear()
        for f in self.db.get_folders():
            item = QListWidgetItem(self.style().standardIcon(QStyle.StandardPixmap.SP_DirIcon), f)
            item.setToolTip(f)
            self.folder_list.addItem(item)

    def add_new_folder(self):
        f = QFileDialog.getExistingDirectory(self, "Ordner wählen")
        if f and self.db.add_folder(f):
            self.load_saved_folders()
            self.start_idx(f)

    def delete_selected_folder(self):
        item = self.folder_list.currentItem()
        if item and QMessageBox.question(self, "Löschen", f"Weg damit?\n{item.text()}", QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No) == QMessageBox.StandardButton.Yes:
            self.db.remove_folder(item.text())
            self.load_saved_folders()

    def rescan(self):
        if item := self.folder_list.currentItem():
            self.start_idx(item.text())

    def start_idx(self, folder):
        if not self.db.model:
            return
        self.set_ui_enabled(False)
        self.btn_cancel.show()
        self.btn_rescan.hide()
        self.prog.show()
        self.idx_thread = IndexerThread(folder, self.db.db_name, self.db.model)
        self.idx_thread.progress_signal.connect(self.lbl_status.setText)
        self.idx_thread.finished_signal.connect(self.idx_done)
        self.idx_thread.start()

    def cancel_idx(self):
        # getattr keeps this safe even if no scan has been started yet
        if getattr(self, "idx_thread", None):
            self.idx_thread.stop()

    def idx_done(self, n, s, c):
        self.set_ui_enabled(True)
        self.btn_cancel.hide()
        self.btn_rescan.show()
        self.prog.hide()
        msg = "Abgebrochen" if c else "Indexierung fertig"
        self.lbl_status.setText(f"{msg}: {n} neu, {s} übersprungen.")