History: Search within files

Preview of version: 27

This page should be merged with Check file indexing


Starting in Tiki8, many handlers are included in the code.


Starting in Tiki10, Tiki can natively index .docx, .xlsx and .pptx files, so you no longer need to install anything for these formats. This works even on shared hosting.


Starting in Tiki18, Tiki can natively index .pdf files.

Search within files


Once you have enabled "Automatic indexing of file content" (Control Panels, File Galleries, Indexing tab), Tiki can index the content of the files in the File Gallery so that they can be retrieved by a search. If you have a script that extracts a file's content as text, you can associate the script with the MIME type, and the content of files of that type will be indexed.

To search within files in the file galleries, you must provide a handler that extracts the text for each file's MIME type (some types may still work by default). The commands, such as strings or pdftotext, must exist on your server. The type-command associations are defined in the Indexing tab of the Admin: File Gallery page.

To check support on your server, use Tiki Check.
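
For a quick manual check from a console, the sketch below reports which extraction commands are available on the PATH. It is only an illustration and not how Tiki Check works internally: the function name check_handlers is hypothetical, and the command list is copied from the handler table on this page.

```shell
#!/bin/sh
# check_handlers is a hypothetical helper name, not part of Tiki.
# It prints one line per command, saying whether it was found on PATH.
check_handlers() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
        fi
    done
}

# Commands taken from the handler table on this page.
check_handlers odt2txt docx2txt.pl xls2csv catppt catdoc strings \
    pstotext pdftotext col man elinks
```

A command reported as missing only matters if you plan to index files of the corresponding MIME type.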

MIME Type | System command | Ubuntu/Debian package with command
application/vnd.oasis.opendocument.presentation | odt2txt %1 | odt2txt
application/vnd.oasis.opendocument.spreadsheet | odt2txt %1 | odt2txt
application/vnd.oasis.opendocument.text | odt2txt %1 | odt2txt
application/vnd.openxmlformats-officedocument.wordprocessingml.document | docx2txt.pl %1 - |
application/ms-excel | xls2csv %1 | catdoc
application/ms-powerpoint | catppt %1 | catdoc
application/msword | catdoc %1 or strings %1 | catdoc
application/pdf | pstotext %1 or pdftotext %1 - | poppler-utils or pstotext
application/postscript | pstotext %1 | pstotext
application/ps | pstotext %1 | pstotext
application/rtf | catdoc %1 | catdoc
application/sgml | col -b %1 or strings %1 | bsdmainutils
application/vnd.ms-excel | xls2csv %1 | catdoc
application/vnd.ms-powerpoint | catppt %1 | catdoc
application/x-msexcel | xls2csv %1 | catdoc
application/x-pdf | pstotext %1 | poppler-utils or pstotext
application/x-troff-man | man -l %1 | man-db
text/enriched | col -b %1 or strings %1 | bsdmainutils
text/html | elinks -dump -no-home %1 | elinks
text/plain | col -b %1 or strings %1 | bsdmainutils
text/richtext | col -b %1 or strings %1 | bsdmainutils
text/sgml | col -b %1 or strings %1 | bsdmainutils
text/tab-separated-values | col -b %1 or strings %1 | bsdmainutils



Several tools can be used to extract searchable strings. Many Unix systems have "strings", which detects sequences that appear to be text within a file, although with less accuracy than more specialized tools.

Ensure that the system command prints its output to standard output (the screen) and not to a file. Try the command on a console and check its manual. For example, pdftotext needs a trailing "-" to write to standard output.
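
One way to try a handler on a console is sketched below. It substitutes a sample file for %1 in a handler command and counts the bytes written to standard output; a count of zero suggests the command writes somewhere else or failed. This assumes %1 stands for the path of the file to index, as the commands in the table suggest. The function name handler_stdout_bytes is hypothetical, and the simple sed substitution assumes the path contains no single quote, |, & or backslash characters.

```shell
#!/bin/sh
# handler_stdout_bytes is a hypothetical helper name, not part of Tiki.
# Usage: handler_stdout_bytes 'COMMAND WITH %1' /path/to/sample-file
handler_stdout_bytes() {
    template=$1   # handler command, with %1 as the file placeholder
    file=$2       # sample file to run the handler on
    # Replace every %1 with the single-quoted file path.
    cmd=$(printf '%s' "$template" | sed "s|%1|'$file'|g")
    # Run the command, discard error messages, count bytes on stdout.
    sh -c "$cmd" 2>/dev/null | wc -c | tr -d ' '
}

# Example (assumes pdftotext is installed and sample.pdf exists):
# handler_stdout_bytes 'pdftotext %1 -' sample.pdf
```

Running the same check with and without the trailing "-" on pdftotext should make the difference visible, since without it the text goes to a file rather than to standard output.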

You may need to clear the Tiki cache after installing a new handler so that the system picks it up.

It is better to have the PHP fileinfo extension installed, to avoid misidentified MIME types (install php-pear if you are using PHP < 5.3).
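
A rough way to see whether fileinfo is loaded is to list the modules of the command-line PHP, as in the sketch below. The function name fileinfo_status is hypothetical. Note the assumption that the command-line PHP uses the same configuration as the PHP serving Tiki, which is not always the case.

```shell
#!/bin/sh
# fileinfo_status is a hypothetical helper name, not part of Tiki.
# It checks the command-line PHP only; the web server's PHP may differ.
fileinfo_status() {
    if ! command -v php >/dev/null 2>&1; then
        echo "php: not found"
    elif php -m 2>/dev/null | grep -qi '^fileinfo$'; then
        echo "fileinfo: loaded"
    else
        echo "fileinfo: not loaded"
    fi
}

fileinfo_status
```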

To install all required packages on a Debian-based server, you can use this command:

sudo apt-get install bsdmainutils catdoc elinks man-db odt2txt php-pear pstotext


To install required packages on ClearOS:
http://wikisuite.org/How-to-configure-ClearOS-to-permit-Tiki-Wiki-CMS-Groupware-to-search-within-files

Related:

History

Information Version
Marc Laporte 29
Bernard Sfez / Tiki Specialist 28
Bernard Sfez / Tiki Specialist Adding information to the doc 27
Marc Laporte 26
Marc Laporte 25
Marc Laporte poppler-utils is commonly available 24
Marc Laporte ClearOS instructions are on Tiki Suite site (will later likely live on ClearFoundation wiki...) 23
Marc Laporte 22
Marc Laporte 21
Marc Laporte 20
sylvie 19
Xavier de Pedro 18
Xavier de Pedro 17
Marc Laporte 16
Xavier de Pedro added basic info for OOo documents with odt2txt 15
Xavier de Pedro added to the documentation toc structure 14
Nelson Ko 13
Rick Sapir / Tiki for Smarties 12
Rick Sapir / Tiki for Smarties 11
Marc Laporte 10
Marc Laporte 9
Marc Laporte Merging rest from File+Gallery+Config. Most, if not all of this content is from Sylvie Gréverend. Thank you Sylvie! 8
Marc Laporte re introducing %%% or %%% to make it visually clear that you put just one 7
Marc Laporte copy-pasting from File+Gallery+Config to catch any differences 6
Marc Laporte 5
Marc Laporte 4
Marc Laporte clarify the "or" 3
Marc Laporte From "Search Admin" 1
