merged search_api_solr submodule

Bachir Soussi Chiadmi 2015-04-19 15:57:04 +02:00
commit 178cfe05d0
49 changed files with 12770 additions and 0 deletions

@ -0,0 +1,178 @@
Search API Solr search 1.4 (12/25/2013):
----------------------------------------
- #2157839 by drunken monkey, Nick_vh: Updated config files to the newest
version.
- #2130827 by drunken monkey: Added additional Solr server information to the
Server overview.
- #2126281 by drunken monkey: Update error handling according to the latest
Search API change.
- #2127991 by drunken monkey: Fixed handling of negated fulltext keys.
- #2113943 by drunken monkey: Fixed clash in specifying the HTTP method for
searches.
- #2127193 by jlapp: Fixed date field values returned for multi-index searches.
- #2122155 by drunken monkey: Added the "Files" tab to contextual links.
- #1846860 by andrewbelcher, drclaw, drunken monkey, danielnolde: Added a way
to easily define new dynamic field types.
- #2064377 by Nick_vh: Made configuration files compatible with Solr Cloud.
- #2107417 by Nick_vh: Fixed config files for Solr 4.5.
Search API Solr search 1.3 (10/23/2013):
----------------------------------------
- #2099683 by drunken monkey: Added support for 'virtual fields' in Views.
- #1997702 by ianthomas_uk, drunken monkey: Added "AUTO" mode for HTTP method.
- #2033913 by drunken monkey: Fixed small error in schema.xml.
- #2073441 by drunken monkey: Removed custom uninstall code for deleting
dependent servers.
- #1882190 by corvus_ch, arnested, drunken monkey: Added optional index ID
prefixes.
Search API Solr search 1.2 (09/01/2013):
----------------------------------------
- #1246730 by febbraro, maciej.zgadzaj, drunken monkey: Added a way to alter
the Solr document when indexing.
- #2053553 by drunken monkey, andrewbelcher: Fixed spatial features with clean
field identifiers.
- #2054373 by drunken monkey: Added the option to use clean field identifiers.
- #1992806 by drunken monkey: Documented problems with Solr 4.3+.
- #2045355 by drunken monkey, arpieb: Fixed result mapping of item IDs.
- #2050157 by izus: Fixed typo in stopwords.txt.
Search API Solr search 1.1 (07/21/2013):
----------------------------------------
- #1957730 by drunken monkey: Fixed filter query strings for negated filters.
- #2010818 by kenorb, drunken monkey: Added new Files tab showing all used solr
config files.
- #2042201 by klausi: Fixed timeouts while optimizing Solr server.
- #2034719 by fago: Added raw term to autocompletion response.
- #2027843 by fago, drunken monkey: Made the Solr response available as part of
the search results.
- #1834614 by drunken monkey: Fixed date fields in MLT queries.
- #1970652 by jsteggink: Fixed highlighting for text fields.
- #2016169 by tomdearden, drunken monkey: Fixed parsing of facets on
multi-valued fields.
- #2008034 by bdecarne: Fixed highlighting in multi-index searches.
Search API Solr search 1.0 (06/09/2013):
----------------------------------------
- #1896080 by drunken monkey: Included additional required config files in the
module.
- #1919572 by chaby: Fixed indexing of geohashes.
- #2004596 by drunken monkey: Fixed "More Like This" for Solr 4.x.
- #2007214 by drunken monkey: Fixed unsetting of object properties.
- #1884312 by drunken monkey, mvc: Fixed resetting of HTTP password upon
re-saving of the configuration form.
- #1957774 by drunken monkey: Fixed displayed link to local Solr servers.
- #1721262 by Steven Jones, das-peter, drunken monkey: Added field collapsing
support.
- #1549244 by cferthorney, drunken monkey: Added SSL Support for Solr servers.
Search API Solr search 1.0, RC 5 (05/17/2013):
----------------------------------------------
- #1190462 by drunken monkey: Documented that enabling HTML filter makes sense.
- #1986284 by drunken monkey: Updated common configs to the latest version.
- #1990422 by populist, drunken monkey: Added support for custom stream contexts
for HTTP requests.
- #1957890 by drunken monkey, jwilson3: Fixed several bugs for facets.
- #1676224 by dasjo, morningtime, drunken monkey: Added support for Solr 4.x.
- #1985522 by chaby: Fixed use of instance method in static escape() method.
- #1979102 by drunken monkey: Fixed wrong limit for limit-less searches.
- #1978632 by chaby, drunken monkey: Fixed wrong check on softCommit.
- #1978600 by chaby: Fixed hook_requirements() for install phase.
- #1976930 by drunken monkey: Fixed duplicate method in SearchApiSolrField.
Search API Solr search 1.0, RC 4 (04/22/2013):
----------------------------------------------
- #1744250 by mollux, drunken monkey, das-peter: Added support for
location-based searches.
- #1846254 by drunken monkey: Removed the SolrPhpClient dependency.
- #1934450 by jwilson3, jlapp: Fixed reference to removed method
getFacetField().
- #1900644 by Deciphered: Fixed facet handling for multi-index searches.
- #1897386 by drunken monkey, Nick_vh: Update the common schema.
Search API Solr search 1.0, RC 3 (01/06/2013):
----------------------------------------------
- #1828260 by drunken monkey: Fixed filtering by index in multi-index searches.
- #1509380 by drunken monkey: Adopt common config files.
- #1815348 by drunken monkey: Fixed queryMultiple() to not use item ID as the
array key.
- #1789204 by Steven Jones: Added way to easily alter the fl parameter.
- #1744250 by mollux, dasjo: Added support for location based search.
- #1813670 by guillaumev: Fixed check for autocomplete configuration in form.
- #1425910 by drunken monkey, mh86: Added setting for maximum occurrence
threshold in autocomplete.
- #1691132 by drunken monkey, David Stosik: Fixed calls to watchdog().
- #1588130 by regilero, David Stosik, drunken monkey: Fixed error handling.
- #1805720 by drunken monkey: Added additional options and improvements for the
autocomplete functionality.
- #1276970 by derhasi, moonray: Fixed large queries break Solr search.
- #1299940 by drunken monkey: Fixed handling of empty response.
- #1507818 by larowlan: Fixed field boosts for standard request handler.
Search API Solr search 1.0, RC 2 (05/23/2012):
----------------------------------------------
- Fixed escaping of error messages.
- #1480170 by kotnik: Fixed return value of hook_requirements().
- #1500210 by ezra-g, acrollet, jsacksick: Fixed errors when installing with
non-default installation profiles.
- #1444432 by Damien Tournoud, jsacksick: Added field-level boosting.
- #1302406 by Steven Jones: Fixed autoload problem during installation.
- #1340244 by drunken monkey, alanomaly: Added more helpful error messages.
Search API Solr search 1.0, RC 1 (11/10/2011):
----------------------------------------------
- #1308638 by drunken monkey: Adapted to new structure of field settings.
- #1308498 by zenlan, drunken monkey: Added flexibility for facet fields.
- #1319544 by drunken monkey: Fixed never delete contents of read-only indexes.
- #1309650 by jonhattan, drunken monkey: Added support for the Libraries API.
Search API Solr search 1.0, Beta 4 (09/08/2011):
------------------------------------------------
- #1230536 by thegreat, drunken monkey: Added support for OR facets.
- #1184002 by drunken monkey: Fixed support of the latest SolrPhpClient version.
- #1032848 by das-peter, drunken monkey: Added possibility to save SolrPhpClient
to the libraries directory.
- #1225926 by drunken monkey, fago: Fixed performance problems in indexing
workflow.
- #1219310 by drunken monkey: Adapted to recent API change.
- #1203680 by klausi: Fixed use of taxonomy terms for "More like this".
- #1181260 by klausi: Fixed mlt.maxwl in solrconfig.xml.
- #1116896 by drunken monkey: Adapted to newer Solr versions.
- #1190462 by drunken monkey: Added option to directly highlight retrieved data
from Solr.
- #1196514 by drunken monkey, klausi: Fixed case sensitivity of input keys for
autocomplete.
- #1192654 by drunken monkey: Added support for the Autocomplete module.
- #1177648 by drunken monkey: Added option to use Solr's built-in highlighting.
- #1154116 by drunken monkey: Added option for retrieving search results data
directly from Solr.
- #1184002 by drunken monkey: Fixed INSTALL.txt to reflect that the module
doesn't work with the latest Solr PHP Client version.
Search API Solr search 1.0, Beta 3 (06/06/2011):
------------------------------------------------
- #1111852 by miiimooo, drunken monkey: Added a 'More like this' feature.
- #1153306 by JoeMcGuire, drunken monkey: Added spellchecking support.
- #1138230 by becw, drunken monkey: Added increased flexibility to the service
class.
- #1127038 by drunken monkey: Fixed handling of date facets.
- #1110820 by becw, drunken monkey: Added support for the Luke request handler.
- #1095956 by drunken monkey: Added Solr-specific index alter hook.
Search API Solr search 1.0, Beta 2 (03/04/2011):
------------------------------------------------
- #1071894 by drunken monkey: Fixed incorrect handling of boolean facets.
- #1071796: Add additional help for Solr-specific extensions.
- #1056018: Better document Solr config customization options.
- #1049900: Field values are sometimes not escaped properly.
- #1043586: Allow Solr server URL to be altered.
- #1010610 by mikejoconnor: Fix hook_requirements().
- #1024146: Don't use file_get_contents() for contacting the Solr server.
- #1010610: More helpful error message when SolrPhpClient is missing.
- #915174: Remove unnecessary files[] declarations from .info file.
- #984134: Add Solr-specific query alter hooks.
Search API Solr search 1.0, Beta 1 (11/29/2010):
------------------------------------------------
Basic functionality is in place and quite well-tested, including support for
facets and for multi-index searches.

@ -0,0 +1,79 @@
Setting up Solr
---------------
In order for this module to work, you will first need to set up a Solr server.
For this, you can either purchase a server from a Solr hosting provider or set
up your own Solr server on your web server (if you have the necessary rights to
do so).
If you want to use a hosted solution, a number of companies are listed on the
module's project page [1]. Otherwise, please follow the instructions below.
A more detailed set of instructions is available at [2].
[1] https://drupal.org/project/search_api_solr
[2] https://drupal.org/node/1999310
As a pre-requisite for running your own Solr server, you'll need Java 6 or
higher.
Download the latest version of Solr 4.x from [3] and unpack the archive
somewhere outside of your web server's document tree.
[3] http://www.apache.org/dyn/closer.cgi/lucene/solr/
This module also supports Solr 1.4 and 3.x. For better performance and more
features, 4.x should be used, though. 1.4 is discouraged altogether, as several
features of the module don't work at all in 1.4.
For small websites, using the example application, located in $SOLR/example/,
usually suffices. In any case, you can use it for developing and testing. The
following instructions will assume you are using the example application;
otherwise, you should be able to substitute the corresponding paths.
NOTE: The Solr 4.3+ example application is currently not completely supported
with the configuration files included in this module, due to a slight change in
directory structure. To fix this, simply copy, move or symlink the contrib/
directory from the top level of the extracted Solr package one level down to
example/.
(For other directory structures: the contrib/ directory has to be in the
directory two levels up from the one which includes the conf/ directory. For
help, just start the Solr server and check the log files for WARN messages;
they should state where Solr expects the directory to be.)
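
The directory rule in the note above can be checked mechanically. The following
is only a minimal sketch and not part of the module: the function name
expected_contrib_dir and the /opt/solr install location are hypothetical.

```python
from pathlib import PurePosixPath


def expected_contrib_dir(conf_dir):
    """Return where contrib/ should be, given the path of a conf/ directory.

    Rule from the note above: contrib/ belongs in the directory two levels
    up from the directory that contains conf/.
    """
    conf = PurePosixPath(conf_dir)
    core_dir = conf.parent              # the directory that includes conf/
    base_dir = core_dir.parent.parent   # two levels up from that directory
    return base_dir / "contrib"


# Hypothetical install location of the 4.x example application.
print(expected_contrib_dir("/opt/solr/example/solr/collection1/conf"))
# prints /opt/solr/example/contrib
```

For the example application this resolves to example/contrib, which matches the
copy, move or symlink step described in the note.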
CAUTION! For production sites, it is vital that you somehow prevent outside
access to the Solr server. Otherwise, attackers could read, corrupt or delete
all your indexed data. Using the example server WON'T prevent this by default.
If available, probably the easiest way of preventing this is to disable
outside access to the ports used by Solr through your server's network
configuration or through the use of a firewall.
Other options include adding basic HTTP authentication or renaming the solr/
directory to a random string of characters and using that as the path.
Before starting the Solr server you will have to make sure it uses the proper
configuration files. These are located in the solr-conf/ directory in this
module, in a sub-directory according to the Solr version you are using. Copy all
the files from that directory into Solr's configuration directory
($SOLR/example/solr/collection1/conf/ in case of the 4.x example application),
after backing up all files that would be overwritten.
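
The backup-then-copy step can also be scripted. The following is a minimal
sketch under stated assumptions, not a tool shipped with this module: the
function name install_solr_config and all three directory arguments are
hypothetical, the target directory is assumed to exist already, and only plain
files directly inside the source directory are handled.

```python
import shutil
from pathlib import Path


def install_solr_config(module_conf_dir, solr_conf_dir, backup_dir):
    """Copy config files into Solr's conf/ directory, backing up first.

    Returns the names of the files that were backed up because they
    would otherwise have been overwritten.
    """
    src = Path(module_conf_dir)
    dst = Path(solr_conf_dir)    # assumed to exist already
    backup = Path(backup_dir)    # created on demand
    backed_up = []
    for src_file in sorted(src.iterdir()):
        if not src_file.is_file():
            continue  # only plain files are copied in this sketch
        target = dst / src_file.name
        if target.exists():
            # Keep a copy of the file that is about to be overwritten.
            backup.mkdir(parents=True, exist_ok=True)
            shutil.copy2(target, backup / src_file.name)
            backed_up.append(src_file.name)
        shutil.copy2(src_file, target)
    return backed_up
```

It would be called with the version-specific sub-directory of solr-conf/ as the
first argument and Solr's conf/ directory as the second. Restoring the previous
state means copying the files from the backup directory back into conf/.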
NOTE: The mapping-ISOLatin1Accent.txt is only included in the module for
completeness' sake, as it is required to start the Solr server. It will
usually be advisable to just use the file of the example application in this case,
though, as it contains really useful definitions, while the file provided with
this module is empty, apart from some documentation. For licensing reasons, it
is not possible for us to include the definitions in the example config file in
the copy this module provides.
You can then start Solr. For the example application, go to $SOLR/example/ and
issue the following command (assuming Java is correctly installed):
java -jar start.jar &
Afterwards, go to [4] in your web browser to ensure Solr is running correctly.
[4] http://localhost:8983/solr/#/
You can then enable this module and create a new server, using the "Solr search"
service class. Enter the hostname, port and path corresponding to your Solr
server in the appropriate fields. The default values already correspond to the
example application, so you won't have to change the values if you use that.
If you are using HTTP authentication to protect your Solr server, you also have
to provide the appropriate user and password here.
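
To see how the hostname, port and path settings fit together, the following
minimal sketch combines them into a base URL. The function name solr_base_url
is hypothetical, and the default values are an assumption inferred from URL [4]
of the example application, not read from the module's code.

```python
def solr_base_url(host="localhost", port=8983, path="/solr", scheme="http"):
    """Combine the server settings into the base URL of the Solr server."""
    # Normalize the path so that "solr", "/solr" and "/solr/" all work.
    path = "/" + path.strip("/")
    return f"{scheme}://{host}:{port}{path}"


# Defaults assumed from the example application's URL [4].
print(solr_base_url())
# prints http://localhost:8983/solr
```

If the solr/ directory was renamed as suggested in the CAUTION note above, the
new name is what goes into the path setting.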

@ -0,0 +1,339 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License.

@ -0,0 +1,155 @@
Solr search
-----------
This module provides an implementation of the Search API which uses an Apache
Solr search server for indexing and searching. Before enabling or using this
module, you'll have to follow the instructions given in INSTALL.txt first.
For more detailed documentation, see the handbook [1].
[1] https://drupal.org/node/1999280
Supported optional features
---------------------------
All Search API datatypes are supported by using appropriate Solr datatypes for
indexing them. By default, "String"/"URI" and "Integer"/"Duration" are defined
equivalently. However, through manual configuration of the used schema.xml this
can be changed arbitrarily. Using your own Solr extensions is thereby also
possible.
The "direct" parse mode for queries will result in the keys being directly used
as the query to Solr. For details about Lucene's query syntax, see [2]. There
are also some Solr additions to this, listed at [3]. Note however that, by
default, this module uses the dismax query handler, so searches like
"field:value" won't work with the "direct" mode.
[2] http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
[3] http://wiki.apache.org/solr/SolrQuerySyntax
Regarding third-party features, the following are supported:
- search_api_autocomplete
Introduced by module: search_api_autocomplete
Lets you add autocompletion capabilities to search forms on the site. (See
also "Hidden variables" below for Solr-specific customization.)
- search_api_facets
Introduced by module: search_api_facetapi
Allows you to create faceted searches for dynamically filtering search
results.
- search_api_facets_operator_or
Introduced by module: search_api_facetapi
Allows the creation of OR facets.
- search_api_mlt
Introduced by module: search_api_views
Lets you display items that are similar to a given one. Use, e.g., to create
a "More like this" block for node pages.
NOTE: Due to a regression in Solr itself, "More like this" doesn't work with
integer and float fields in Solr 4. As a work-around, you can index the fields
(or copies of them) as string values. See [4] for details.
Also, MLT with date fields isn't currently supported at all for any version.
- search_api_multi
Introduced by module: search_api_multi
Allows you to search multiple indexes at once, as long as they are on the same
server. You can use this to let users simultaneously search all content on the
site: nodes, comments, user profiles, etc.
- search_api_spellcheck
Introduced by module: search_api_spellcheck
Gives the option to display automatic spellchecking for searches.
- search_api_data_type_location
Introduced by module: search_api_location
Lets you index, filter and sort on location fields. Note, however, that only
single-valued fields are currently supported for Solr 3.x, and that the option
isn't supported at all in Solr 1.4.
- search_api_grouping
Introduced by module: search_api_grouping [5]
Lets you group search results based on indexed fields. For further information,
see the FieldCollapsing documentation in the Solr wiki [6].
If you feel some service option is missing, or have other ideas for improving
this implementation, please file a feature request in the project's issue queue,
at [7].
[4] https://drupal.org/node/2004596
[5] https://drupal.org/sandbox/daspeter/1783280
[6] http://wiki.apache.org/solr/FieldCollapsing
[7] https://drupal.org/project/issues/search_api_solr
Specifics
---------
Please note that, since Solr handles tokenizing, stemming and other
preprocessing tasks, activating any preprocessors in a search index's settings
is usually unnecessary and can even be counterproductive. If you are adding an
index to a Solr server, you should therefore disable all processors which
handle such classic preprocessing tasks. Enabling the HTML filter can be
useful, though, as the default config files included in this module don't
handle stripping out HTML tags.
Also, due to the way Solr works, using a single field for fulltext searching
will result in the smallest index size and best search performance, and may
have other advantages as well. Therefore, if you don't need to search
different sets of fields in different searches on an index, it is advised that
you collect all fields that should be searchable into a single field using the
"Aggregated fields" data alteration.
Clean field identifiers:
If your Solr server was created in a module version prior to 1.2, you will get
the option to switch the server to "Clean field identifiers" (which is the
default for all new servers). This will change the Solr field names used for
all fields whose Search API identifiers contain a colon (i.e., all nested
fields) to support some advanced functionality, like sorting by distance, for
which Solr is buggy when using field names with colons.
The only downside of this change is that the data in Solr for these fields
will become invalid, so all indexes on the server which contain such fields
will be scheduled for re-indexing. (If you don't want to search on incomplete
data until the re-indexing is finished, you can additionally manually clear
the indexes, on their Status tabs, to prevent this.)
Hidden variables
----------------
- search_api_solr_autocomplete_max_occurrences (default: 0.9)
By default, keywords that occur in more than 90% of results are ignored for
autocomplete suggestions. This setting lets you modify that behaviour by
providing your own ratio. Use 1 or greater to use all suggestions.
- search_api_solr_index_prefix (default: '')
By default, the index ID in the Solr server is the same as the index's machine
name in Drupal. This setting will let you specify a prefix for the index IDs
on this Drupal installation. Only use alphanumeric characters and underscores.
Since changing the prefix makes the currently indexed data inaccessible, you
should change this variable only when no indexes are currently on any Solr
servers.
- search_api_solr_index_prefix_INDEX_ID (default: '')
Same as above, but a per-index prefix. Use the index's machine name as
INDEX_ID in the variable name. Per-index prefixing is done before the global
prefix is added, so the global prefix will come first in the final name:
(GLOBAL_PREFIX)(INDEX_PREFIX)(INDEX_ID)
The same rules as above apply for setting the prefix.
- search_api_solr_http_get_max_length (default: 4000)
The maximum number of bytes that can be handled as an HTTP GET query when the
HTTP method is AUTO. Typically Solr can handle up to 65535 bytes, but Tomcat
and Jetty will error at slightly less than 4096 bytes.
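The prefix rules and the AUTO threshold described above can be sketched in a
few lines of PHP. This is only an illustration of the documented behaviour, not
the module's actual code: all three function names are hypothetical, and the
GET/POST rule (GET when the query fits within the limit, POST otherwise) is an
assumption about how AUTO applies the limit.

```php
<?php
// Illustrative sketch only. None of these functions exist in the
// search_api_solr module; they merely restate the rules documented above.

/**
 * Builds the index ID used in Solr from the configured prefixes.
 *
 * Follows the documented order: (GLOBAL_PREFIX)(INDEX_PREFIX)(INDEX_ID).
 */
function example_prefixed_index_id($index_id, $global_prefix = '', $index_prefix = '') {
  return $global_prefix . $index_prefix . $index_id;
}

/**
 * Checks that a prefix uses only alphanumeric characters and underscores.
 */
function example_is_valid_prefix($prefix) {
  return preg_match('/\A[A-Za-z0-9_]*\z/', $prefix) === 1;
}

/**
 * Picks the HTTP method for a query string when the method is AUTO.
 *
 * Assumption: queries of up to $max_length bytes use GET, longer ones POST.
 */
function example_auto_http_method($query_string, $max_length = 4000) {
  // strlen() counts bytes, which matches the unit used by the variable.
  return strlen($query_string) <= $max_length ? 'GET' : 'POST';
}

echo example_prefixed_index_id('node_index', 'site1_', 'dev_'), "\n";
echo example_auto_http_method(str_repeat('a', 5000)), "\n";
```

With a global prefix of "site1_" and a per-index prefix of "dev_", the index
"node_index" would be stored as "site1_dev_node_index", and a 5000-byte query
would be sent via POST under the default limit of 4000.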
Customizing your Solr server
----------------------------
The schema.xml and solrconfig.xml files contain extensive comments on how to
add additional features or modify behaviour, e.g., for adding a language-
specific stemmer or a stopword list.
If you are interested in further customizing your Solr server to your needs,
see the Solr wiki at [8] for documentation. When editing the schema.xml and
solrconfig.xml files, please edit only the copies in the Solr configuration
directory, not the originals provided with this module.
[8] http://wiki.apache.org/solr/
You'll have to restart your Solr server after making such changes, for them to
take effect.
Developers
----------
The SearchApiSolrService class has a few custom extensions, documented with its
code. Methods of note are deleteItems(), which treats the first argument
differently in certain cases, and the methods at the end of service.inc.

View File

@ -0,0 +1,435 @@
<?php
/**
* Copyright (c) 2007-2009, Conduit Internet Technologies, Inc.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* - Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* - Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* - Neither the name of Conduit Internet Technologies, Inc. nor the names of
* its contributors may be used to endorse or promote products derived from
* this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
* @copyright Copyright 2007-2009 Conduit Internet Technologies, Inc. (http://conduit-it.com)
* @license New BSD (http://solr-php-client.googlecode.com/svn/trunk/COPYING)
* @version $Id: Document.php 15 2009-08-04 17:53:08Z donovan.jimenez $
*
* @package Apache
* @subpackage Solr
* @author Donovan Jimenez <djimenez@conduit-it.com>
*/
/**
* Additional code Copyright (c) 2011 by Peter Wolanin, and
* additional contributors.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or (at
* your option) any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
* or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program as the file LICENSE.txt; if not, please see
* http://www.gnu.org/licenses/old-licenses/gpl-2.0.txt.
*/
/**
* Holds Key / Value pairs that represent a Solr Document along with any
* associated boost values. Field values can be accessed by direct dereferencing
* such as:
*
* @code
* $document->title = 'Something';
* echo $document->title;
* @endcode
*
* Additionally, the field values can be iterated with foreach:
*
* @code
* foreach ($document as $fieldName => $fieldValue) {
* // ...
* }
* @endcode
*/
class SearchApiSolrDocument implements IteratorAggregate {
/**
* Document boost value.
*
* @var float|false
*/
protected $documentBoost = FALSE;
/**
* Document field values, indexed by name.
*
* @var array
*/
protected $fields = array();
/**
* Document field boost values, indexed by name.
*
* @var array
*/
protected $fieldBoosts = array();
/**
* Clears all boosts and fields from this document.
*/
public function clear() {
$this->documentBoost = FALSE;
$this->fields = array();
$this->fieldBoosts = array();
}
/**
* Gets the current document boost.
*
* @return float|false
* The current document boost, or FALSE if none is set.
*/
public function getBoost() {
return $this->documentBoost;
}
/**
* Sets the document boost factor.
*
* @param float|false $boost
* FALSE for default boost, or a positive number for setting a document
* boost.
*/
public function setBoost($boost) {
$boost = (float) $boost;
if ($boost > 0.0) {
$this->documentBoost = $boost;
}
else {
$this->documentBoost = FALSE;
}
}
/**
* Adds a value to a multi-valued field.
*
* NOTE: the solr XML format allows you to specify boosts PER value even
* though the underlying Lucene implementation only allows a boost per field.
* To remedy this, the final field boost value will be the product of all
* specified boosts on field values - this is similar to SolrJ's
* functionality.
*
* @code
* $doc = new SearchApiSolrDocument();
* $doc->addField('foo', 'bar', 2.0);
* $doc->addField('foo', 'baz', 3.0);
* // Resultant field boost will be 6!
* echo $doc->getFieldBoost('foo');
* @endcode
*
* @param string $key
* The name of the field.
* @param mixed $value
* The value to add for the field.
* @param float|false $boost
* FALSE for default boost, or a positive number for setting a field boost.
*/
public function addField($key, $value, $boost = FALSE) {
if (!isset($this->fields[$key])) {
// create holding array if this is the first value
$this->fields[$key] = array();
}
elseif (!is_array($this->fields[$key])) {
// move existing value into array if it is not already an array
$this->fields[$key] = array($this->fields[$key]);
}
if ($this->getFieldBoost($key) === FALSE) {
// boost not already set, set it now
$this->setFieldBoost($key, $boost);
}
elseif ((float) $boost > 0.0) {
// multiply passed boost with current field boost - similar to SolrJ implementation
$this->fieldBoosts[$key] *= (float) $boost;
}
// add value to array
$this->fields[$key][] = $value;
}
/**
* Gets information about a field stored in Solr.
*
* @param string $key
* The name of the field.
*
* @return array|false
* An associative array of info if the field exists, FALSE otherwise.
*/
public function getField($key) {
if (isset($this->fields[$key])) {
return array(
'name' => $key,
'value' => $this->fields[$key],
'boost' => $this->getFieldBoost($key)
);
}
return FALSE;
}
/**
* Sets a field value.
*
* Multi-valued fields should be set as arrays or via the addField()
* function which will automatically make sure the field is an array.
*
* @param string $key
* The name of the field.
* @param string|array $value
* The value to set for the field.
* @param float|false $boost
* FALSE for default boost, or a positive number for setting a field boost.
*/
public function setField($key, $value, $boost = FALSE) {
$this->fields[$key] = $value;
$this->setFieldBoost($key, $boost);
}
/**
* Gets the currently set field boost for a document field.
*
* @param string $key
* The name of the field.
*
* @return float|false
* The currently set field boost, or FALSE if none was set.
*/
public function getFieldBoost($key) {
return isset($this->fieldBoosts[$key]) ? $this->fieldBoosts[$key] : FALSE;
}
/**
* Sets the field boost for a document field.
*
* @param string $key
* The name of the field.
* @param float|false $boost
* FALSE for default boost, or a positive number for setting a field boost.
*/
public function setFieldBoost($key, $boost) {
$boost = (float) $boost;
if ($boost > 0.0) {
$this->fieldBoosts[$key] = $boost;
}
else {
$this->fieldBoosts[$key] = FALSE;
}
}
/**
* Returns all current field boosts, indexed by field name.
*
* @return array
* An associative array in the format $field_name => $field_boost.
*/
public function getFieldBoosts() {
return $this->fieldBoosts;
}
/**
* Gets the names of all fields in this document.
*
* @return array
* The names of all fields in this document.
*/
public function getFieldNames() {
return array_keys($this->fields);
}
/**
* Gets the values of all fields in this document.
*
* @return array
* The values of all fields in this document.
*/
public function getFieldValues() {
return array_values($this->fields);
}
/**
* Implements IteratorAggregate::getIterator().
*
* Implementing the IteratorAggregate interface allows the following usage:
* @code
* foreach ($document as $key => $value) {
* // ...
* }
* @endcode
*
* @return Traversable
* An iterator over this document's fields.
*/
public function getIterator() {
$arrayObject = new ArrayObject($this->fields);
return $arrayObject->getIterator();
}
/**
* Magic getter for field values.
*
* @param string $key
* The name of the field.
*
* @return string|array|null
* The value that was set for the field, or NULL if the field is not set.
*/
public function __get($key) {
return isset($this->fields[$key]) ? $this->fields[$key] : NULL;
}
/**
* Magic setter for field values.
*
* Multi-valued fields should be set as arrays or via the addField() function
* which will automatically make sure the field is an array.
*
* @param string $key
* The name of the field.
* @param string|array $value
* The value to set for the field.
*/
public function __set($key, $value) {
$this->setField($key, $value);
}
/**
* Magic isset for field values.
*
* Do not call directly. Allows the following usage:
* @code
* isset($document->some_field);
* @endcode
*
* @param string $key
* The name of the field.
*
* @return bool
* Whether the given key is set in this document.
*/
public function __isset($key) {
return isset($this->fields[$key]);
}
/**
* Magic unset for field values.
*
* Do not call directly. Allows the following usage:
* @code
* unset($document->some_field);
* @endcode
*
* @param string $key
* The name of the field.
*/
public function __unset($key) {
unset($this->fields[$key]);
unset($this->fieldBoosts[$key]);
}
/**
* Create an XML fragment from this document.
*
* This string can then be used inside a Solr add call.
*
* @return string
* An XML formatted string for this document.
*/
public function toXml() {
$xml = '<doc';
if ($this->documentBoost !== FALSE) {
$xml .= ' boost="' . $this->documentBoost . '"';
}
$xml .= '>';
foreach ($this->fields as $key => $value) {
$fieldBoost = $this->getFieldBoost($key);
$key = htmlspecialchars($key, ENT_COMPAT, 'UTF-8');
if (is_array($value)) {
foreach ($value as $multivalue) {
$xml .= '<field name="' . $key . '"';
if ($fieldBoost !== FALSE) {
$xml .= ' boost="' . $fieldBoost . '"';
// Only set the boost for the first field in the set.
$fieldBoost = FALSE;
}
$xml .= '>' . htmlspecialchars($multivalue, ENT_NOQUOTES, 'UTF-8') . '</field>';
}
}
else {
$xml .= '<field name="' . $key . '"';
if ($fieldBoost !== FALSE) {
$xml .= ' boost="' . $fieldBoost . '"';
}
$xml .= '>' . htmlspecialchars($value, ENT_NOQUOTES, 'UTF-8') . '</field>';
}
}
$xml .= '</doc>';
// Remove any control characters to avoid Solr XML parser exception.
return self::stripCtrlChars($xml);
}
/**
* Sanitizes XML for sending to Solr.
*
* Replaces control (non-printable) characters that are invalid to Solr's XML
* parser with a space.
*
* @param string $string
* The string to sanitize.
*
* @return string
* A string safe for including in a Solr request.
*/
public static function stripCtrlChars($string) {
// See: http://w3.org/International/questions/qa-forms-utf-8.html
// Printable UTF-8 does not include any of these characters below \x7F.
return preg_replace('@[\x00-\x08\x0B\x0C\x0E-\x1F]@', ' ', $string);
}
}

File diff suppressed because it is too large

View File

@ -0,0 +1,905 @@
<?php
/**
* Copyright (c) 2007-2009, Conduit Internet Technologies, Inc.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* - Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* - Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* - Neither the name of Conduit Internet Technologies, Inc. nor the names of
* its contributors may be used to endorse or promote products derived from
* this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
* @copyright Copyright 2007-2009 Conduit Internet Technologies, Inc. (http://conduit-it.com)
* @license New BSD (http://solr-php-client.googlecode.com/svn/trunk/COPYING)
* @version $Id: Service.php 22 2009-11-09 22:46:54Z donovan.jimenez $
*
* @package Apache
* @subpackage Solr
* @author Donovan Jimenez <djimenez@conduit-it.com>
*/
/**
* Additional code Copyright (c) 2008-2011 by Robert Douglass, James McKinney,
* Jacob Singh, Alejandro Garza, Peter Wolanin, and additional contributors.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or (at
* your option) any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
* or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program as the file LICENSE.txt; if not, please see
* http://www.gnu.org/licenses/old-licenses/gpl-2.0.txt.
*/
/**
* Represents a Solr server resource.
*
* Contains methods for pinging, adding, deleting, committing, optimizing and
* searching.
*/
class SearchApiSolrConnection implements SearchApiSolrConnectionInterface {
/**
* Defines how NamedLists should be formatted in the output.
*
* This specifically affects facet counts. Valid values are 'map' (default) or
* 'flat'.
*/
const NAMED_LIST_FORMAT = 'map';
/**
* Path to the ping servlet.
*/
const PING_SERVLET = 'admin/ping';
/**
* Path to the update servlet.
*/
const UPDATE_SERVLET = 'update';
/**
* Path to the search servlet.
*/
const SEARCH_SERVLET = 'select';
/**
* Path to the luke servlet.
*/
const LUKE_SERVLET = 'admin/luke';
/**
* Path to the system servlet.
*/
const SYSTEM_SERVLET = 'admin/system';
/**
* Path to the stats servlet.
*/
const STATS_SERVLET = 'admin/stats.jsp';
/**
* Path to the stats servlet for Solr 4.x servers.
*/
const STATS_SERVLET_4 = 'admin/mbeans?wt=xml&stats=true';
/**
* Path to the file servlet.
*/
const FILE_SERVLET = 'admin/file';
/**
* The options passed when creating this connection.
*
* @var array
*/
protected $options;
/**
* The Solr server's URL.
*
* @var string
*/
protected $base_url;
/**
* Cached URL to the update servlet.
*
* @var string
*/
protected $update_url;
/**
* HTTP Basic Authentication header to set for requests to the Solr server.
*
* @var string
*/
protected $http_auth;
/**
* The stream context to use for requests to the Solr server.
*
* Defaults to NULL (= pass no context at all).
*
* @var resource|null
*/
protected $stream_context;
/**
* Cache for the metadata from admin/luke.
*
* Contains an array of response objects, keyed by the number of "top terms".
*
* @var array
*
* @see getLuke()
*/
protected $luke = array();
/**
* Cache for information about the Solr core.
*
* @var SimpleXMLElement
*
* @see getStats()
*/
protected $stats;
/**
* Cache for system information.
*
* @var array
*
* @see getSystemInfo()
*/
protected $system_info;
/**
* Flag that denotes whether to use soft commits for Solr 4.x.
*
* Defaults to FALSE.
*
* @var bool
*/
protected $soft_commit = FALSE;
/**
* Implements SearchApiSolrConnectionInterface::__construct().
*
* Valid options include:
* - scheme: Scheme of the base URL of the Solr server. Most probably "http"
* or "https". Defaults to "http".
* - host: The host name (or IP) of the Solr server. Defaults to
* "localhost".
* - port: The port of the Solr server. Defaults to 8983.
* - path: The base path to the Solr server. Defaults to "/solr/".
* - http_user: If both this and "http_pass" are set, will use this
* information to add basic HTTP authentication to all requests to the
* Solr server. Not set by default.
* - http_pass: See "http_user".
*/
public function __construct(array $options) {
$options += array(
'scheme' => 'http',
'host' => 'localhost',
'port' => 8983,
'path' => 'solr',
'http_user' => NULL,
'http_pass' => NULL,
);
$this->options = $options;
$path = '/' . trim($options['path'], '/') . '/';
$this->base_url = $options['scheme'] . '://' . $options['host'] . ':' . $options['port'] . $path;
// Set HTTP Basic Authentication parameter, if login data was set.
if (strlen($options['http_user']) && strlen($options['http_pass'])) {
$this->http_auth = 'Basic ' . base64_encode($options['http_user'] . ':' . $options['http_pass']);
}
}
/**
* Implements SearchApiSolrConnectionInterface::ping().
*/
public function ping($timeout = 2) {
$start = microtime(TRUE);
if ($timeout <= 0.0) {
$timeout = -1;
}
$pingUrl = $this->constructUrl(self::PING_SERVLET);
// Attempt a HEAD request to the Solr ping url.
$options = array(
'method' => 'HEAD',
'timeout' => $timeout,
);
$response = $this->makeHttpRequest($pingUrl, $options);
if ($response->code == 200) {
// Add 1 µs to the ping time so we never return 0.
return (microtime(TRUE) - $start) + 1E-6;
}
else {
return FALSE;
}
}
/**
* Implements SearchApiSolrConnectionInterface::setSoftCommit().
*/
public function setSoftCommit($soft_commit) {
$this->soft_commit = (bool) $soft_commit;
}
/**
* Implements SearchApiSolrConnectionInterface::getSoftCommit().
*/
public function getSoftCommit() {
return $this->soft_commit;
}
/**
* Implements SearchApiSolrConnectionInterface::setStreamContext().
*/
public function setStreamContext($stream_context) {
$this->stream_context = $stream_context;
}
/**
* Implements SearchApiSolrConnectionInterface::getStreamContext().
*/
public function getStreamContext() {
return $this->stream_context;
}
/**
* Computes the cache ID to use for this connection.
*
* @param string $suffix
* (optional) A suffix to append to the string to make it unique.
*
* @return string|null
* The cache ID to use for this connection and usage; or NULL if no caching
* should take place.
*/
protected function getCacheId($suffix = '') {
if (!empty($this->options['server'])) {
$cid = $this->options['server'];
return $suffix ? "$cid:$suffix" : $cid;
}
// No server ID is known, so no caching should take place.
return NULL;
}
/**
* Call the /admin/system servlet to retrieve system information.
*
* Stores the retrieved information in $system_info.
*
* @see getSystemInfo()
*/
protected function setSystemInfo() {
$cid = $this->getCacheId(__FUNCTION__);
if ($cid) {
$cache = cache_get($cid, 'cache_search_api_solr');
if ($cache) {
$this->system_info = json_decode($cache->data);
}
}
// Second pass to populate the cache if necessary.
if (empty($this->system_info)) {
$url = $this->constructUrl(self::SYSTEM_SERVLET, array('wt' => 'json'));
$response = $this->sendRawGet($url);
$this->system_info = json_decode($response->data);
if ($cid) {
cache_set($cid, $response->data, 'cache_search_api_solr');
}
}
}
/**
* Implements SearchApiSolrConnectionInterface::getSystemInfo().
*/
public function getSystemInfo() {
if (!isset($this->system_info)) {
$this->setSystemInfo();
}
return $this->system_info;
}
/**
* Sets $this->luke with the metadata about the index from admin/luke.
*/
protected function setLuke($num_terms = 0) {
if (empty($this->luke[$num_terms])) {
$cid = $this->getCacheId(__FUNCTION__ . ":$num_terms");
if ($cid) {
$cache = cache_get($cid, 'cache_search_api_solr');
if (isset($cache->data)) {
$this->luke = $cache->data;
}
}
// Second pass to populate the cache if necessary.
if (empty($this->luke[$num_terms])) {
$params = array(
'numTerms' => "$num_terms",
'wt' => 'json',
'json.nl' => self::NAMED_LIST_FORMAT,
);
$url = $this->constructUrl(self::LUKE_SERVLET, $params);
$this->luke[$num_terms] = $this->sendRawGet($url);
if ($cid) {
cache_set($cid, $this->luke, 'cache_search_api_solr');
}
}
}
}
/**
* Implements SearchApiSolrConnectionInterface::getFields().
*/
public function getFields($num_terms = 0) {
$fields = array();
foreach ($this->getLuke($num_terms)->fields as $name => $info) {
$fields[$name] = new SearchApiSolrField($info);
}
return $fields;
}
/**
* Implements SearchApiSolrConnectionInterface::getLuke().
*/
public function getLuke($num_terms = 0) {
if (!isset($this->luke[$num_terms])) {
$this->setLuke($num_terms);
}
return $this->luke[$num_terms];
}
/**
* Implements SearchApiSolrConnectionInterface::getSolrVersion().
*/
public function getSolrVersion() {
$system_info = $this->getSystemInfo();
// Return only the major version, i.e., the first character of the version
// string reported by Solr.
if (isset($system_info->lucene->{'solr-spec-version'})) {
return $system_info->lucene->{'solr-spec-version'}[0];
}
return 0;
}
/**
* Stores information about the Solr core in $this->stats.
*/
protected function setStats() {
$data = $this->getLuke();
$solr_version = $this->getSolrVersion();
// Only try to get stats if we have connected to the index.
if (empty($this->stats) && isset($data->index->numDocs)) {
$cid = $this->getCacheId(__FUNCTION__);
if ($cid) {
$cache = cache_get($cid, 'cache_search_api_solr');
if (isset($cache->data)) {
$this->stats = simplexml_load_string($cache->data);
}
}
// Second pass to populate the cache if necessary.
if (empty($this->stats)) {
if ($solr_version >= 4) {
$url = $this->constructUrl(self::STATS_SERVLET_4);
}
else {
$url = $this->constructUrl(self::STATS_SERVLET);
}
$response = $this->sendRawGet($url);
$this->stats = simplexml_load_string($response->data);
if ($cid) {
cache_set($cid, $response->data, 'cache_search_api_solr');
}
}
}
}
/**
* Implements SearchApiSolrConnectionInterface::getStats().
*/
public function getStats() {
if (!isset($this->stats)) {
$this->setStats();
}
return $this->stats;
}
/**
* Implements SearchApiSolrConnectionInterface::getStatsSummary().
*/
public function getStatsSummary() {
$stats = $this->getStats();
$solr_version = $this->getSolrVersion();
$summary = array(
'@pending_docs' => '',
'@autocommit_time_seconds' => '',
'@autocommit_time' => '',
'@deletes_by_id' => '',
'@deletes_by_query' => '',
'@deletes_total' => '',
'@schema_version' => '',
'@core_name' => '',
'@index_size' => '',
);
if (!empty($stats)) {
if ($solr_version <= 3) {
$docs_pending_xpath = $stats->xpath('//stat[@name="docsPending"]');
$summary['@pending_docs'] = (int) trim(current($docs_pending_xpath));
$max_time_xpath = $stats->xpath('//stat[@name="autocommit maxTime"]');
$max_time = (int) trim(current($max_time_xpath));
// Convert to seconds.
$summary['@autocommit_time_seconds'] = $max_time / 1000;
$summary['@autocommit_time'] = format_interval($max_time / 1000);
$deletes_id_xpath = $stats->xpath('//stat[@name="deletesById"]');
$summary['@deletes_by_id'] = (int) trim(current($deletes_id_xpath));
$deletes_query_xpath = $stats->xpath('//stat[@name="deletesByQuery"]');
$summary['@deletes_by_query'] = (int) trim(current($deletes_query_xpath));
$summary['@deletes_total'] = $summary['@deletes_by_id'] + $summary['@deletes_by_query'];
$schema = $stats->xpath('/solr/schema[1]');
$summary['@schema_version'] = trim($schema[0]);
$core = $stats->xpath('/solr/core[1]');
$summary['@core_name'] = trim($core[0]);
$size_xpath = $stats->xpath('//stat[@name="indexSize"]');
$summary['@index_size'] = trim(current($size_xpath));
}
else {
$system_info = $this->getSystemInfo();
$docs_pending_xpath = $stats->xpath('//lst["stats"]/long[@name="docsPending"]');
$summary['@pending_docs'] = (int) trim(current($docs_pending_xpath));
$max_time_xpath = $stats->xpath('//lst["stats"]/str[@name="autocommit maxTime"]');
$max_time = (int) trim(current($max_time_xpath));
// Convert to seconds.
$summary['@autocommit_time_seconds'] = $max_time / 1000;
$summary['@autocommit_time'] = format_interval($max_time / 1000);
$deletes_id_xpath = $stats->xpath('//lst["stats"]/long[@name="deletesById"]');
$summary['@deletes_by_id'] = (int) trim(current($deletes_id_xpath));
$deletes_query_xpath = $stats->xpath('//lst["stats"]/long[@name="deletesByQuery"]');
$summary['@deletes_by_query'] = (int) trim(current($deletes_query_xpath));
$summary['@deletes_total'] = $summary['@deletes_by_id'] + $summary['@deletes_by_query'];
$schema = $system_info->core->schema;
$summary['@schema_version'] = $schema;
$core = $stats->xpath('//lst["core"]/str[@name="coreName"]');
$summary['@core_name'] = trim(current($core));
$size_xpath = $stats->xpath('//lst["core"]/str[@name="indexSize"]');
$summary['@index_size'] = trim(current($size_xpath));
}
}
return $summary;
}
/**
* Implements SearchApiSolrConnectionInterface::clearCache().
*/
public function clearCache() {
if ($cid = $this->getCacheId()) {
cache_clear_all($cid, 'cache_search_api_solr', TRUE);
}
$this->luke = array();
$this->stats = NULL;
$this->system_info = NULL;
}
/**
* Checks the response code and throws an exception if it's not 200.
*
* @param object $response
* A response object.
*
* @return object
* The passed response object.
*
* @throws SearchApiException
* If the object's HTTP status is not 200.
*/
protected function checkResponse($response) {
$code = (int) $response->code;
if ($code != 200) {
if ($code >= 400 && $code != 403 && $code != 404) {
// Add details, like Solr's exception message.
$response->status_message .= $response->data;
}
throw new SearchApiException('"' . $code . '" Status: ' . $response->status_message);
}
return $response;
}
/**
* Implements SearchApiSolrConnectionInterface::makeServletRequest().
*/
public function makeServletRequest($servlet, array $params = array(), array $options = array()) {
// Add default params.
$params += array(
'wt' => 'json',
'json.nl' => self::NAMED_LIST_FORMAT,
);
$url = $this->constructUrl($servlet, $params);
$response = $this->makeHttpRequest($url, $options);
return $this->checkResponse($response);
}
/**
* Central method for making a GET operation against this Solr server.
*/
protected function sendRawGet($url, array $options = array()) {
$options['method'] = 'GET';
$response = $this->makeHttpRequest($url, $options);
return $this->checkResponse($response);
}
/**
* Central method for making a POST operation against this Solr server.
*/
protected function sendRawPost($url, array $options = array()) {
$options['method'] = 'POST';
// Normally we use POST to send XML documents.
if (empty($options['headers']['Content-Type'])) {
$options['headers']['Content-Type'] = 'text/xml; charset=UTF-8';
}
$response = $this->makeHttpRequest($url, $options);
return $this->checkResponse($response);
}
/**
* Sends an HTTP request to Solr.
*
* This is just a wrapper around drupal_http_request().
*/
protected function makeHttpRequest($url, array $options = array()) {
if (empty($options['method']) || $options['method'] == 'GET' || $options['method'] == 'HEAD') {
// Make sure we are not sending a request body.
$options['data'] = NULL;
}
if ($this->http_auth) {
$options['headers']['Authorization'] = $this->http_auth;
}
if ($this->stream_context) {
$options['context'] = $this->stream_context;
}
$result = drupal_http_request($url, $options);
if (!isset($result->code) || $result->code < 0) {
$result->code = 0;
$result->status_message = 'Request failed';
$result->protocol = 'HTTP/1.0';
}
// Additional information may be in the error property.
if (isset($result->error)) {
$result->status_message .= ': ' . check_plain($result->error);
}
if (!isset($result->data)) {
$result->data = '';
$result->response = NULL;
}
else {
$response = json_decode($result->data);
if (is_object($response)) {
foreach ($response as $key => $value) {
$result->$key = $value;
}
}
}
return $result;
}
/**
* Implements SearchApiSolrConnectionInterface::escape().
*/
public static function escape($value, $version = 0) {
$replacements = array();
$specials = array('+', '-', '&&', '||', '!', '(', ')', '{', '}', '[', ']', '^', '"', '~', '*', '?', ':', "\\");
// Solr 4.x introduces regular expressions, making the slash also a special
// character.
if ($version >= 4) {
$specials[] = '/';
}
foreach ($specials as $special) {
$replacements[$special] = "\\$special";
}
return strtr($value, $replacements);
}
/**
* Implements SearchApiSolrConnectionInterface::escapePhrase().
*/
public static function escapePhrase($value) {
$replacements['"'] = '\"';
$replacements["\\"] = "\\\\";
return strtr($value, $replacements);
}
/**
* Implements SearchApiSolrConnectionInterface::phrase().
*/
public static function phrase($value) {
return '"' . self::escapePhrase($value) . '"';
}
/**
* Implements SearchApiSolrConnectionInterface::escapeFieldName().
*/
public static function escapeFieldName($value) {
$value = str_replace(':', '\:', $value);
return $value;
}
/**
* Returns the HTTP URL for a certain servlet on the Solr server.
*
* @param string $servlet
* A string path to a Solr request handler.
* @param array $params
* Additional GET parameters to append to the URL.
* @param string|null $added_query_string
* (optional) An already encoded query string to append to the URL.
*
* @return string
* The complete URL, including the query string.
*/
protected function constructUrl($servlet, array $params = array(), $added_query_string = NULL) {
// PHP's built in http_build_query() doesn't give us the format Solr wants.
$query_string = $this->httpBuildQuery($params);
if ($query_string) {
$query_string = '?' . $query_string;
if ($added_query_string) {
$query_string = $query_string . '&' . $added_query_string;
}
}
elseif ($added_query_string) {
$query_string = '?' . $added_query_string;
}
return $this->base_url . $servlet . $query_string;
}
/**
* Implements SearchApiSolrConnectionInterface::getBaseUrl().
*/
public function getBaseUrl() {
return $this->base_url;
}
/**
* Implements SearchApiSolrConnectionInterface::setBaseUrl().
*/
public function setBaseUrl($url) {
$this->base_url = $url;
$this->update_url = NULL;
}
/**
* Implements SearchApiSolrConnectionInterface::update().
*/
public function update($rawPost, $timeout = FALSE) {
if (empty($this->update_url)) {
// Store the URL in an instance variable since many updates may be sent
// via a single instance of this class.
$this->update_url = $this->constructUrl(self::UPDATE_SERVLET, array('wt' => 'json'));
}
$options['data'] = $rawPost;
if ($timeout) {
$options['timeout'] = $timeout;
}
return $this->sendRawPost($this->update_url, $options);
}
/**
* Implements SearchApiSolrConnectionInterface::addDocuments().
*/
public function addDocuments(array $documents, $overwrite = NULL, $commitWithin = NULL) {
$attr = '';
if (isset($overwrite)) {
$attr .= ' overwrite="' . ($overwrite ? 'true"' : 'false"');
}
if (isset($commitWithin)) {
$attr .= ' commitWithin="' . ((int) $commitWithin) . '"';
}
$rawPost = "<add$attr>";
foreach ($documents as $document) {
if (is_object($document) && ($document instanceof SearchApiSolrDocument)) {
$rawPost .= $document->toXml();
}
}
$rawPost .= '</add>';
return $this->update($rawPost);
}
/**
* Implements SearchApiSolrConnectionInterface::commit().
*/
public function commit($waitSearcher = TRUE, $timeout = 3600) {
return $this->optimizeOrCommit('commit', $waitSearcher, $timeout);
}
/**
* Implements SearchApiSolrConnectionInterface::deleteById().
*/
public function deleteById($id, $timeout = 3600) {
return $this->deleteByMultipleIds(array($id), $timeout);
}
/**
* Implements SearchApiSolrConnectionInterface::deleteByMultipleIds().
*/
public function deleteByMultipleIds(array $ids, $timeout = 3600) {
$rawPost = '<delete>';
foreach ($ids as $id) {
$rawPost .= '<id>' . htmlspecialchars($id, ENT_NOQUOTES, 'UTF-8') . '</id>';
}
$rawPost .= '</delete>';
return $this->update($rawPost, $timeout);
}
/**
* Implements SearchApiSolrConnectionInterface::deleteByQuery().
*/
public function deleteByQuery($rawQuery, $timeout = 3600) {
$rawPost = '<delete><query>' . htmlspecialchars($rawQuery, ENT_NOQUOTES, 'UTF-8') . '</query></delete>';
return $this->update($rawPost, $timeout);
}
/**
* Implements SearchApiSolrConnectionInterface::optimize().
*/
public function optimize($waitSearcher = TRUE, $timeout = 3600) {
return $this->optimizeOrCommit('optimize', $waitSearcher, $timeout);
}
/**
* Sends a commit or optimize command to the Solr server.
*
* Will be synchronous unless $waitSearcher is set to FALSE.
*
* @param string $type
* Either "commit" or "optimize".
* @param bool $waitSearcher
* (optional) Wait until a new searcher is opened and registered as the main
* query searcher, making the changes visible. Defaults to true.
* @param int $timeout
* Seconds to wait until timing out with an exception. Defaults to an hour.
*
* @return object
* A response object.
*
* @throws SearchApiException
* If an error occurs during the service call.
*/
protected function optimizeOrCommit($type, $waitSearcher = TRUE, $timeout = 3600) {
$waitSearcher = $waitSearcher ? '' : ' waitSearcher="false"';
if ($this->getSolrVersion() <= 3) {
$rawPost = "<$type$waitSearcher />";
}
else {
$softCommit = ($this->soft_commit) ? ' softCommit="true"' : '';
$rawPost = "<$type$waitSearcher$softCommit />";
}
$response = $this->update($rawPost, $timeout);
$this->clearCache();
return $response;
}
/**
* Generates a URL-encoded query string.
*
* Works like PHP's built in http_build_query() (or drupal_http_build_query())
* but uses rawurlencode() and no [] for repeated params, to be compatible
* with the Java-based servers Solr runs on.
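*
* As an illustrative sketch (the parameter values are made up for this
* example and are not taken from the module), nested arrays become repeated
* keys and NULL values become bare keys:
* @code
* // Called from within the connection class.
* $string = $this->httpBuildQuery(array(
*   'q' => 'foo bar',
*   'fq' => array('type:page', 'status:1'),
*   'debug' => NULL,
* ));
* // $string is now "q=foo%20bar&fq=type%3Apage&fq=status%3A1&debug".
* @endcode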
*
* @param array $query
* The query parameters which should be set.
* @param string $parent
* Internal use only.
*
* @return string
* A query string to append (after "?") to a URL.
*/
protected function httpBuildQuery(array $query, $parent = '') {
$params = array();
foreach ($query as $key => $value) {
$key = ($parent ? $parent : rawurlencode($key));
// Recurse into children.
if (is_array($value)) {
$params[] = $this->httpBuildQuery($value, $key);
}
// If a query parameter value is NULL, only append its key.
elseif (!isset($value)) {
$params[] = $key;
}
else {
$params[] = $key . '=' . rawurlencode($value);
}
}
return implode('&', $params);
}
/**
* {@inheritdoc}
*/
public function search($query = NULL, array $params = array(), $method = 'GET') {
// Always use JSON. See
// http://code.google.com/p/solr-php-client/issues/detail?id=6#c1 for
// reasoning.
$params['wt'] = 'json';
// Additional default params.
$params += array(
'json.nl' => self::NAMED_LIST_FORMAT,
);
if ($query) {
$params['q'] = $query;
}
// PHP's built-in http_build_query() doesn't give us the format Solr wants.
$queryString = $this->httpBuildQuery($params);
if ($method == 'GET' || $method == 'AUTO') {
$searchUrl = $this->constructUrl(self::SEARCH_SERVLET, array(), $queryString);
if ($method == 'GET' || strlen($searchUrl) <= variable_get('search_api_solr_http_get_max_length', 4000)) {
return $this->sendRawGet($searchUrl);
}
}
// Method is POST, or AUTO with a query too long for a GET request.
$searchUrl = $this->constructUrl(self::SEARCH_SERVLET);
$options['data'] = $queryString;
$options['headers']['Content-Type'] = 'application/x-www-form-urlencoded; charset=UTF-8';
return $this->sendRawPost($searchUrl, $options);
}
}

@ -0,0 +1,365 @@
<?php
/**
* The interface for a Solr connection class.
*/
interface SearchApiSolrConnectionInterface {
/**
* Constructs a Solr connection object.
*
* @param array $options
* An array containing construction arguments.
*/
public function __construct(array $options);
/**
* Calls the /admin/ping servlet, to test the connection to the server.
*
* @param int|false $timeout
* Maximum time to wait for ping in seconds, -1 for unlimited (default 2).
*
* @return float|false
* Seconds taken to ping the server, FALSE if timeout occurred.
*/
public function ping($timeout = 2);
/**
* Sets whether this connection will use soft commits when committing.
*
* Note that this setting only has any effect when using Solr 4.x or higher.
*
* @param bool $soft_commit
* TRUE if soft commits should be used, FALSE otherwise. Default is FALSE.
*/
public function setSoftCommit($soft_commit);
/**
* Tells whether this connection will use soft commits when committing.
*
* Note that this setting only has any effect when using Solr 4.x or higher.
*
* @return bool
* TRUE if soft commits will be used, FALSE otherwise.
*/
public function getSoftCommit();
/**
* Set the stream context to use for requests to the Solr server.
*
* Must be a valid stream context as created by stream_context_create(). By
* default, no special stream context will be used.
*
* @param resource|null $stream_context
* A valid stream context as created by stream_context_create(), or NULL to
* use the default behavior.
*/
public function setStreamContext($stream_context);
/**
* Returns the stream context to use for requests to the Solr server.
*
* By default, no special stream context will be used and this method will
* return NULL.
*
* @return resource|null
* A valid stream context as created by stream_context_create(), or NULL if
* the default behavior is used.
*/
public function getStreamContext();
/**
* Gets information about the Solr Core.
*
* @return object
* A response object with system information.
*/
public function getSystemInfo();
/**
* Get metadata about fields in the Solr/Lucene index.
*
* @param int $num_terms
* Number of 'top terms' to return.
*
* @return array
* An array of SearchApiSolrField objects.
*/
public function getFields($num_terms = 0);
/**
* Gets meta-data about the index.
*
* @param int $num_terms
* Number of 'top terms' to return.
*
* @return object
* A response object filled with data from Solr's Luke.
*/
public function getLuke($num_terms = 0);
/**
* Gets information about the Solr core.
*
* @return SimpleXMLElement
* A SimpleXML document.
*/
public function getStats();
/**
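* Gets summary information about the Solr Core.
*
* @return array
* An associative array of summary values, keyed by placeholders such as
* "@pending_docs", "@deletes_total", "@schema_version" and "@index_size".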
*/
public function getStatsSummary();
/**
* Clears the cached Solr data.
*/
public function clearCache();
/**
* Makes a request to a servlet (a path) that's not a standard path.
*
* @param string $servlet
* A path to be added to the base Solr path. e.g. 'extract/tika'.
* @param array $params
* Any request parameters when constructing the URL.
* @param array $options
* Options to be passed to drupal_http_request().
*
* @return object
* The HTTP response object.
*
* @throws Exception
*/
public function makeServletRequest($servlet, array $params = array(), array $options = array());
/**
* Gets the base URL of the Solr server.
*
* @return string
* The base URL of the Solr server.
*/
public function getBaseUrl();
/**
* Sets the base URL of the Solr server.
*
* @param string $url
* The new base URL of the Solr server.
*/
public function setBaseUrl($url);
/**
* Sends a raw update request to the Solr server.
*
* Takes a raw post body and sends it to the update service. Post body should
* be a complete and well-formed XML document.
*
* @param string $rawPost
* The XML document to send to the Solr server's update service.
* @param int|false $timeout
* (optional) Maximum expected duration (in seconds). Defaults to not timing
* out.
*
* @return object
* A response object.
*
* @throws Exception
* If an error occurs during the service call.
*/
public function update($rawPost, $timeout = FALSE);
/**
* Adds an array of Solr documents to the index all at once.
*
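* A hedged usage sketch: $connection is assumed to implement this interface,
* and the "id" field name and its value are placeholders. It also assumes
* that SearchApiSolrDocument can be created without constructor arguments.
* @code
* $document = new SearchApiSolrDocument();
* $document->setField('id', 'example-1');
* // Overwrite existing documents and ask Solr to commit within 5 seconds.
* $connection->addDocuments(array($document), TRUE, 5000);
* @endcode
*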
* @param array $documents
* An array of SearchApiSolrDocument instances. Other values are skipped.
* @param bool|null $overwrite
* (optional) Set whether existing documents with the same IDs should be
* overwritten. Defaults to TRUE.
* @param int|null $commitWithin
* (optional) The time in which the indexed documents should be committed to
* the index, in milliseconds. This works in addition to the Solr server's
* auto commit settings. Defaults to no additional handling.
*
* @return object
* A response object.
*
* @throws Exception
* If an error occurs during the service call.
*/
public function addDocuments(array $documents, $overwrite = NULL, $commitWithin = NULL);
/**
* Sends a commit command to the Solr server.
*
* Will be synchronous unless $waitSearcher is set to FALSE.
*
* @param bool $waitSearcher
* (optional) Wait until a new searcher is opened and registered as the main
* query searcher, making the changes visible. Defaults to true.
* @param int|false $timeout
* Seconds to wait until timing out with an exception. Defaults to an hour.
*
* @return object
* A response object.
*
* @throws Exception
* If an error occurs during the service call.
*/
public function commit($waitSearcher = TRUE, $timeout = 3600);
/**
* Sends a delete request based on a document ID.
*
* @param string $id
* The ID of the document which should be deleted. Expected to be UTF-8
* encoded.
* @param int|false $timeout
* Seconds to wait until timing out with an exception. Defaults to an hour.
*
* @return object
* A response object.
*
* @throws Exception
* If an error occurs during the service call.
*/
public function deleteById($id, $timeout = 3600);
/**
* Sends a delete request for several documents, based on the document IDs.
*
* @param array $ids
* The IDs of the documents which should be deleted. Expected to be UTF-8
* encoded.
* @param int|false $timeout
* Seconds to wait until timing out with an exception. Defaults to an hour.
*
* @return object
* A response object.
*
* @throws Exception
* If an error occurs during the service call.
*/
public function deleteByMultipleIds(array $ids, $timeout = 3600);
/**
* Sends a delete request for all documents that match the given Solr query.
*
* @param string $rawQuery
* The query whose results should be deleted. Expected to be UTF-8 encoded.
* @param int|false $timeout
* Seconds to wait until timing out with an exception. Defaults to an hour.
*
* @return object
* A response object.
*
* @throws Exception
* If an error occurs during the service call.
*/
public function deleteByQuery($rawQuery, $timeout = 3600);
/**
* Sends an optimize command to the Solr server.
*
* Will be synchronous unless $waitSearcher is set to FALSE.
*
* @param bool $waitSearcher
* (optional) Wait until a new searcher is opened and registered as the main
* query searcher, making the changes visible. Defaults to true.
* @param int|false $timeout
* Seconds to wait until timing out with an exception. Defaults to an hour.
*
* @return object
* A response object.
*
* @throws Exception
* If an error occurs during the service call.
*/
public function optimize($waitSearcher = TRUE, $timeout = 3600);
/**
* Executes a search on the Solr server.
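*
* A hedged usage sketch: $connection is assumed to be an object implementing
* this interface, and the field names are placeholders rather than fields
* from a real schema.
* @code
* $params = array(
*   'rows' => 10,
*   'facet' => 'true',
*   // Parameters used more than once are passed as arrays.
*   'facet.field' => array('ss_type', 'is_author'),
* );
* // With "AUTO", GET is used unless the URL gets too long for it.
* $response = $connection->search('title:solr', $params, 'AUTO');
* @endcode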
*
* @param string|null $query
* (optional) The raw query string. Defaults to an empty query.
* @param array $params
* (optional) Key / value pairs for other query parameters (see Solr
* documentation). Use arrays for parameter keys used more than once (e.g.,
* facet.field).
* @param string $method
* The HTTP method to use. Must be either "GET", "POST" or "AUTO". Defaults
* to "GET".
*
* @return object
* A response object.
*
* @throws Exception
* If an error occurs during the service call.
*/
public function search($query = NULL, array $params = array(), $method = 'GET');
/**
* Escapes special characters from a Solr query.
*
* A complete list of special characters in Solr queries can be viewed at
* http://lucene.apache.org/java/docs/queryparsersyntax.html#Escaping%20Special%20Characters
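*
* A brief sketch of the results, assuming the SearchApiSolrConnection class
* shipped with this module is the implementation in use:
* @code
* SearchApiSolrConnection::escape('foo:bar');
* // Returns the string foo\:bar
* SearchApiSolrConnection::escape('(1+1)');
* // Returns the string \(1\+1\)
* SearchApiSolrConnection::escape('a/b', 4);
* // Returns the string a\/b, since the slash is special as of Solr 4.
* @endcode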
*
* @param string $value
* The string to escape.
* @param int $version
* (optional) The major version of the Solr server. Defaults to 0.
*
* @return string
* An escaped string suitable for passing to Solr.
*/
public static function escape($value, $version = 0);
/**
* Escapes a string that should be included in a Solr phrase.
*
* In contrast to escape(), this only escapes '"' and '\'.
*
* @param string $value
* The string to escape.
*
* @return string
* An escaped string suitable for passing to Solr.
*/
public static function escapePhrase($value);
/**
* Converts a string to a Solr phrase.
*
* @param string $value
* The string to convert to a phrase.
*
* @return string
* A phrase string suitable for passing to Solr.
*/
public static function phrase($value);
/**
* Escapes a Search API field name for passing to Solr.
*
* Since field names can only contain one special character, ":", there is no
* need to use the complete escape() method.
*
* @param string $value
* The field name to escape.
*
* @return string
* An escaped string suitable for passing to Solr.
*/
public static function escapeFieldName($value);
/**
* Gets the current solr version.
*
* @return int
* 1, 3 or 4. For a more detailed version, use getSystemInfo().
*/
public function getSolrVersion();
}

@ -0,0 +1,281 @@
<?php
/**
* Logic around Solr field schema information.
*/
class SearchApiSolrField {
/**
* @var array
* Human-readable labels for Solr schema properties.
*/
public static $schemaLabels = array(
'I' => 'Indexed',
'T' => 'Tokenized',
'S' => 'Stored',
'M' => 'Multivalued',
'V' => 'TermVector Stored',
'o' => 'Store Offset With TermVector',
'p' => 'Store Position With TermVector',
'O' => 'Omit Norms',
'L' => 'Lazy',
'B' => 'Binary',
'C' => 'Compressed',
'f' => 'Sort Missing First',
'l' => 'Sort Missing Last',
);
/**
* @var stdClass
* The original field object.
*/
protected $field;
/**
* @var array
* An array of schema properties for this field. This will be a subset of
* the SearchApiSolrField::schemaLabels array.
*/
protected $schema;
/**
* Constructs a field information object.
*
* @param stdClass $field
* A field object from Solr's "Luke" servlet.
*/
public function __construct($field) {
$this->field = $field;
}
/**
* Gets the raw information of the field.
*
* @return object
* A field metadata object.
*/
public function getRaw() {
return $this->field;
}
/**
* Gets the type of the Solr field, according to the Solr schema.
*
* Note that field types like "text", "boolean", and "date" are conventions,
* but their presence and behavior are entirely determined by the particular
* schema.xml file used by a Solr core.
*
* @return string
* The type of the Solr field.
*/
public function getType() {
return $this->field->type;
}
/**
* Gets an array of field properties.
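*
* For instance (a sketch with a made-up Luke schema string):
* @code
* $raw = new stdClass();
* $raw->schema = 'I-S-M';
* $field = new SearchApiSolrField($raw);
* $schema = $field->getSchema();
* // $schema maps I, S and M to 'Indexed', 'Stored' and 'Multivalued'.
* $sortable = $field->isSortable();
* // $sortable is FALSE, because the field is multivalued.
* @endcode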
*
* @return array
* An array of properties describing the solr schema. The array keys are
* single-character codes, and the values are human-readable labels. This
* will be a subset of the SearchApiSolrField::schemaLabels array.
*/
public function getSchema() {
if (!isset($this->schema)) {
foreach (str_split(str_replace('-', '', $this->field->schema)) as $key) {
$this->schema[$key] = isset(self::$schemaLabels[$key]) ? self::$schemaLabels[$key] : $key;
}
}
return $this->schema;
}
/**
* Gets the "dynamic base" of this field.
*
* This typically looks like 'ss_*', and is used to aggregate fields based on
* "Hungarian" naming conventions.
*
* @return string
* The mask describing the solr aggregate field, if there is one.
*/
public function getDynamicBase() {
return isset($this->field->dynamicBase) ? $this->field->dynamicBase : NULL;
}
/**
* Determines whether this field may be suitable for use as a key field.
*
* Unfortunately, it seems like the best way to find an actual uniqueKey field
* according to Solr is to examine the Solr core's schema.xml.
*
* @return bool
* Whether the field is suitable for use as a key.
*/
public function isPossibleKey() {
return !$this->getDynamicBase()
&& !in_array($this->getType(), array('boolean', 'date', 'text'))
&& $this->isStored()
&& !$this->isMultivalued();
}
/**
* Determines whether a field is suitable for sorting.
*
* In order for a field to yield useful sorted results in Solr, it must be
* indexed, not multivalued, and not tokenized. It's ok if a field is
* tokenized and yields only one token, but there's no general way to check
* for that.
*
* @return bool
* Whether the field is suitable for sorting.
*/
public function isSortable() {
return $this->isIndexed()
&& !$this->isMultivalued()
&& !$this->isTokenized();
}
/**
* Determines whether this field is indexed.
*
* @return bool
* TRUE if the field is indexed, FALSE otherwise.
*/
public function isIndexed() {
$this->getSchema();
return isset($this->schema['I']);
}
/**
* Determines whether this field is tokenized.
*
* @return bool
* TRUE if the field is tokenized, FALSE otherwise.
*/
public function isTokenized() {
$this->getSchema();
return isset($this->schema['T']);
}
/**
* Determines whether this field is stored.
*
* @return bool
* TRUE if the field is stored, FALSE otherwise.
*/
public function isStored() {
$this->getSchema();
return isset($this->schema['S']);
}
/**
* Determines whether this field is multi-valued.
*
* @return bool
* TRUE if the field is multi-valued, FALSE otherwise.
*/
public function isMultivalued() {
$this->getSchema();
return isset($this->schema['M']);
}
/**
* Determines whether this field has stored term vectors.
*
* @return bool
* TRUE if the field has stored term vectors, FALSE otherwise.
*/
public function isTermVectorStored() {
$this->getSchema();
return isset($this->schema['V']);
}
/**
* Determines whether this field has the "termOffsets" option set.
*
* @return bool
* TRUE if the field has the "termOffsets" option set, FALSE otherwise.
*/
public function isStoreOffsetWithTermVector() {
$this->getSchema();
return isset($this->schema['o']);
}
/**
* Determines whether this field has the "termPositions" option set.
*
* @return bool
* TRUE if the field has the "termPositions" option set, FALSE otherwise.
*/
public function isStorePositionWithTermVector() {
$this->getSchema();
return isset($this->schema['p']);
}
/**
* Determines whether this field omits norms when indexing.
*
* @return bool
* TRUE if the field omits norms, FALSE otherwise.
*/
public function isOmitNorms() {
$this->getSchema();
return isset($this->schema['O']);
}
/**
* Determines whether this field is lazy-loaded.
*
* @return bool
* TRUE if the field is lazy-loaded, FALSE otherwise.
*/
public function isLazy() {
$this->getSchema();
return isset($this->schema['L']);
}
/**
* Determines whether this field is binary.
*
* @return bool
* TRUE if the field is binary, FALSE otherwise.
*/
public function isBinary() {
$this->getSchema();
return isset($this->schema['B']);
}
/**
* Determines whether this field is compressed.
*
* @return bool
* TRUE if the field is compressed, FALSE otherwise.
*/
public function isCompressed() {
$this->getSchema();
return isset($this->schema['C']);
}
/**
* Determines whether this field sorts missing entries first.
*
* @return bool
* TRUE if the field sorts missing entries first, FALSE otherwise.
*/
public function isSortMissingFirst() {
$this->getSchema();
return isset($this->schema['f']);
}
/**
* Determines whether this field sorts missing entries last.
*
* @return bool
* TRUE if the field sorts missing entries last, FALSE otherwise.
*/
public function isSortMissingLast() {
$this->getSchema();
return isset($this->schema['l']);
}
}

@ -0,0 +1,33 @@
<?php
/**
* @file
* Contains the SearchApiSpellcheckSolr class.
*/
/**
* Spellcheck class which can provide spelling suggestions. The constructor
* populates the instance with any suggestions returned by Solr.
*/
class SearchApiSpellcheckSolr extends SearchApiSpellcheck {
/**
* Constructs a SearchApiSpellcheckSolr object.
*
* If Solr has returned spelling suggestions, loops through them and adds
* them to this spellcheck service.
*
* @param object $response
* The Solr response object.
*/
function __construct($response) {
if (isset($response->spellcheck->suggestions)) {
$suggestions = $response->spellcheck->suggestions;
foreach ($suggestions as $word => $data) {
foreach ($data->suggestion as $suggestion) {
$this->addSuggestion(new SearchApiSpellcheckSuggestion($word, $suggestion));
}
}
}
}
}

@ -0,0 +1,62 @@
<?php
/**
* @file
* Admin page callbacks for the Search API Solr module.
*/
/**
* Form constructor for the Solr files overview.
*
* @param SearchApiServer $server
* The server for which files should be displayed.
*
* @ingroup forms
*/
function search_api_solr_solr_config_form($form, &$form_state, SearchApiServer $server) {
$form['title']['#markup'] = '<h2>' . t('List of configuration files found:') . '</h2>';
try {
// Retrieve the list of available files.
$files_list = search_api_solr_server_get_files($server);
if (empty($files_list)) {
$form['info']['#markup'] = t('No files found.');
return $form;
}
$form['files'] = array(
'#type' => 'vertical_tabs',
);
// Generate a fieldset for each file.
foreach ($files_list as $file_name => $file_info) {
$file_date = format_date(strtotime($file_info['modified']));
$escaped_file_name = check_plain($file_name);
$form['files'][$file_name] = array(
'#title' => $escaped_file_name,
'#type' => 'fieldset',
);
$data = '<h3>' . $escaped_file_name . '</h3>';
$data .= '<p><em>' . t('Last modified: @time.', array('@time' => $file_date)) . '</em></p>';
if ($file_info['size'] > 0) {
$file_data = $server->getFile($file_name);
$data .= '<pre><code>' . check_plain($file_data->data) . '</code></pre>';
}
else {
$data .= '<p><em>' . t('The file is empty.') . '</em></p>';
}
$form['files'][$file_name]['data']['#markup'] = $data;
}
}
catch (SearchApiException $e) {
watchdog_exception('search_api_solr', $e, '%type while retrieving config files of Solr server @server: !message in %function (line %line of %file).', array('@server' => $server->name));
$form['info']['#markup'] = t('An error occurred while trying to load the list of files.');
}
return $form;
}

@ -0,0 +1,150 @@
<?php
/**
* @file
* Hooks provided by the Search API Solr search module.
*/
/**
* @addtogroup hooks
* @{
*/
/**
* Lets modules alter a Solr search request before sending it.
*
* SearchApiSolrConnectionInterface::search() is called afterwards with these
* parameters. Please see this method for details on what should be altered
* where and what is set afterwards.
*
* @param array $call_args
* An associative array containing all three arguments to the
* SearchApiSolrConnectionInterface::search() call ("query", "params" and
* "method") as references.
* @param SearchApiQueryInterface $query
* The SearchApiQueryInterface object representing the executed search query.
*/
function hook_search_api_solr_query_alter(array &$call_args, SearchApiQueryInterface $query) {
if ($query->getOption('foobar')) {
$call_args['params']['foo'] = 'bar';
}
}
/**
* Change the way the index's field names are mapped to Solr field names.
*
* @param SearchApiIndex $index
* The index whose field mappings are altered.
* @param array $fields
* An associative array containing the index field names mapped to their Solr
* counterparts. The special fields 'search_api_id' and 'search_api_relevance'
* are also included.
*/
function hook_search_api_solr_field_mapping_alter(SearchApiIndex $index, array &$fields) {
if ($index->entity_type == 'node' && isset($fields['body:value'])) {
$fields['body:value'] = 'text';
}
}
/**
* Alter Solr documents before they are sent to Solr for indexing.
*
* @param array $documents
* An array of SearchApiSolrDocument objects ready to be indexed, generated
* from $items array.
* @param SearchApiIndex $index
* The search index for which items are being indexed.
* @param array $items
* An array of items being indexed.
*/
function hook_search_api_solr_documents_alter(array &$documents, SearchApiIndex $index, array $items) {
// Adds a "foo" field with value "bar" to all documents.
foreach ($documents as $document) {
$document->setField('foo', 'bar');
}
}
/**
* Lets modules alter the search results returned from a Solr search.
*
* @param array $results
* The results array that will be returned for the search.
* @param SearchApiQueryInterface $query
* The SearchApiQueryInterface object representing the executed search query.
* @param object $response
* The Solr response object.
*/
function hook_search_api_solr_search_results_alter(array &$results, SearchApiQueryInterface $query, $response) {
if (isset($response->facet_counts->facet_fields->custom_field)) {
// Do something with $results.
}
}
/**
* Lets modules alter a Solr search request for a multi-index search.
*
* SearchApiSolrConnectionInterface::search() is called afterwards with these
* parameters. Please see this method for details on what should be altered
* where and what is set afterwards.
*
* @param array $call_args
* An associative array containing all three arguments to the
* SearchApiSolrConnectionInterface::search() call ("query", "params" and
* "method") as references.
* @param SearchApiMultiQueryInterface $query
* The object representing the executed search query.
*/
function hook_search_api_solr_multi_query_alter(array &$call_args, SearchApiMultiQueryInterface $query) {
if ($query->getOption('foobar')) {
$call_args['params']['foo'] = 'bar';
}
}
/**
* Provide Solr dynamic fields as Search API data types.
*
* This serves as a placeholder for documenting additional keys for
* hook_search_api_data_type_info() which are recognized by this module to
* automatically support dynamic field types from the schema.
*
* @return array
* In addition to the keys for the individual types that are defined by
* hook_search_api_data_type_info(), the following keys are recognized:
* - prefix: The Solr field name prefix to use for this type. Should match
* two existing dynamic field definitions with names "{PREFIX}s_*" and
* "{PREFIX}m_*".
* - always multiValued: (optional) If TRUE, only the dynamic field name
* prefix (without the "_*" portion) with multiValued="true" should be given
* by "prefix", instead of the common prefix part for both the single-valued
* and the multi-valued field. This should be the case for all fulltext
* fields, since they might already be tokenized by the Search API. Defaults
* to FALSE.
*
* @see hook_search_api_data_type_info()
*/
function search_api_solr_hook_search_api_data_type_info() {
return array(
// You can use any identifier you want here, but it makes sense to use the
// field type name from schema.xml.
'edge_n2_kw_text' => array(
// Stock hook_search_api_data_type_info() info:
'name' => t('Fulltext (w/ partial matching)'),
'fallback' => 'text',
// Dynamic field with name="te_*".
'prefix' => 'te',
// Fulltext types should always be multi-valued.
'always multiValued' => TRUE,
),
'tlong' => array(
// Stock hook_search_api_data_type_info() info:
'name' => t('TrieLong'),
'fallback' => 'integer',
// Dynamic fields with name="its_*" and name="itm_*".
'prefix' => 'it',
),
);
}
/**
* @} End of "addtogroup hooks".
*/
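/**
 * Illustrative sketch only, not part of this module's API.
 *
 * Shows how the "prefix" and "always multiValued" keys documented for
 * search_api_solr_hook_search_api_data_type_info() could translate into a Solr
 * dynamic field name. The function name, its parameters and the sample field
 * keys are hypothetical; the service class may build field names differently.
 *
 * @param array $type_info
 *   Data type information containing at least the "prefix" key.
 * @param string $field_key
 *   The (clean) field identifier to append to the prefix.
 * @param bool $multi_valued
 *   (optional) Whether the field holds multiple values. Defaults to FALSE.
 *
 * @return string
 *   A field name matching one of the dynamic fields described above.
 */
function example_search_api_solr_dynamic_field_name(array $type_info, $field_key, $multi_valued = FALSE) {
  $prefix = $type_info['prefix'];
  if (empty($type_info['always multiValued'])) {
    // Regular types have two dynamic fields: "{PREFIX}s_*" and "{PREFIX}m_*".
    $prefix .= $multi_valued ? 'm' : 's';
  }
  // With "always multiValued", the prefix already names the one dynamic field.
  return $prefix . '_' . $field_key;
}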

View File

@ -0,0 +1,19 @@
name = Solr search
description = Offers an implementation of the Search API that uses an Apache Solr server for indexing content.
dependencies[] = search_api
core = 7.x
package = Search
files[] = includes/document.inc
files[] = includes/service.inc
files[] = includes/solr_connection.inc
files[] = includes/solr_connection.interface.inc
files[] = includes/solr_field.inc
files[] = includes/spellcheck.inc
; Information added by Drupal.org packaging script on 2013-12-25
version = "7.x-1.4"
core = "7.x"
project = "search_api_solr"
datestamp = "1387970905"

View File

@ -0,0 +1,117 @@
<?php
/**
* Implements hook_schema().
*/
function search_api_solr_schema() {
// See, e.g., block_schema() for this trick. Seems to be the best way to get a
// cache table definition.
$schema['cache_search_api_solr'] = drupal_get_schema_unprocessed('system', 'cache');
$schema['cache_search_api_solr']['description'] = 'Cache table for the Search API Solr module to store various data related to Solr servers.';
return $schema;
}
/**
* Implements hook_requirements().
*/
function search_api_solr_requirements($phase) {
$ret = array();
if ($phase == 'runtime') {
$servers = search_api_server_load_multiple(FALSE, array('class' => 'search_api_solr_service', 'enabled' => TRUE));
$count = 0;
$unavailable = 0;
$last = NULL;
foreach ($servers as $server) {
if (!$server->ping()) {
++$unavailable;
$last = $server;
}
++$count;
}
if (!$count) {
return array();
}
$ret['search_api_solr'] = array(
'title' => t('Solr servers'),
'value' => format_plural($count, '1 server', '@count servers'),
);
if ($unavailable) {
if ($unavailable == 1) {
$ret['search_api_solr']['description'] = t('The Solr server of <a href="!url">%name</a> could not be reached.',
array('!url' => url('admin/config/search/search_api/server/' . $last->machine_name), '%name' => $last->name));
}
else {
$ret['search_api_solr']['description'] = t('@count Solr servers could not be reached.', array('@count' => $unavailable));
}
$ret['search_api_solr']['severity'] = REQUIREMENT_ERROR;
}
else {
$ret['search_api_solr']['description'] = format_plural($count, 'The Solr server could be reached.', 'All @count Solr servers could be reached.');
$ret['search_api_solr']['severity'] = REQUIREMENT_OK;
}
}
return $ret;
}
/**
* Implements hook_uninstall().
*/
function search_api_solr_uninstall() {
variable_del('search_api_solr_last_optimize');
variable_del('search_api_solr_autocomplete_max_occurrences');
variable_del('search_api_solr_index_prefix');
variable_del('search_api_solr_http_get_max_length');
}
/**
* Implements hook_update_dependencies().
*/
function search_api_solr_update_dependencies() {
// This update should run after primary IDs have been changed to machine names in the framework.
$dependencies['search_api_solr'][7101] = array(
'search_api' => 7102,
);
return $dependencies;
}
/**
* Implements transition from using the index IDs to using machine names.
*/
function search_api_solr_update_7101() {
foreach (search_api_server_load_multiple(FALSE, array('class' => 'search_api_solr_service')) as $server) {
if ($server->enabled) {
$server->deleteItems('all');
}
else {
$tasks = variable_get('search_api_tasks', array());
$tasks[$server->machine_name][''] = array('clear all');
variable_set('search_api_tasks', $tasks);
}
$query = db_select('search_api_index', 'i')
->fields('i', array('machine_name'))
->condition('server', $server->machine_name);
db_update('search_api_item')
->fields(array(
'changed' => REQUEST_TIME,
))
->condition('index_id', $query, 'IN')
->execute();
}
return t('The Solr search module was updated. ' .
'Please stop your Solr servers, replace their schema.xml with the new version and then start them again. ' .
'All data indexed on Solr servers will have to be reindexed.');
}
/**
* Create the Search API Solr cache table {cache_search_api_solr}.
*/
function search_api_solr_update_7102() {
if (!db_table_exists('cache_search_api_solr')) {
$table = drupal_get_schema_unprocessed('system', 'cache');
$table['description'] = 'Cache table for the Search API Solr module to store various data related to Solr servers.';
db_create_table('cache_search_api_solr', $table);
}
}

View File

@ -0,0 +1,317 @@
<?php
/**
* @file
* Provides a Solr-based service class for the Search API.
*/
/**
* Implements hook_menu().
*/
function search_api_solr_menu() {
$items['admin/config/search/search_api/server/%search_api_server/files'] = array(
'title' => 'Files',
'description' => 'View Solr configuration files.',
'page callback' => 'drupal_get_form',
'page arguments' => array('search_api_solr_solr_config_form', 5),
'access callback' => 'search_api_solr_access_server_files',
'access arguments' => array(5),
'file' => 'search_api_solr.admin.inc',
'type' => MENU_LOCAL_TASK,
'weight' => -1,
'context' => MENU_CONTEXT_INLINE | MENU_CONTEXT_PAGE,
);
return $items;
}
/**
* Implements hook_search_api_service_info().
*/
function search_api_solr_search_api_service_info() {
$variables = array(
'@solr_wiki_url' => url('http://wiki.apache.org/solr/SolrQuerySyntax'),
'@readme_url' => url(drupal_get_path('module', 'search_api_solr') . '/README.txt'),
);
$services['search_api_solr_service'] = array(
'name' => t('Solr service'),
'description' => t('<p>Index items using an Apache Solr search server.</p>
<ul>
<li>See <a href="@solr_wiki_url">the Solr wiki</a> for information about the "direct" parse mode.</li>
<li>Will use internal Solr preprocessors, so Search API preprocessors should for the most part be deactivated.</li>
<li>See the <a href="@readme_url">README.txt</a> file provided with this module for details.</li>
</ul>', $variables),
'class' => 'SearchApiSolrService',
);
return $services;
}
/**
* Implements hook_help().
*/
function search_api_solr_help($path, array $arg = array()) {
if ($path == 'admin/config/search/search_api') {
// Included because we need the REQUIREMENT_* constants.
include_once(DRUPAL_ROOT . '/includes/install.inc');
module_load_include('install', 'search_api_solr');
$reqs = search_api_solr_requirements('runtime');
foreach ($reqs as $req) {
if (isset($req['description'])) {
$type = $req['severity'] == REQUIREMENT_ERROR ? 'error' : ($req['severity'] == REQUIREMENT_WARNING ? 'warning' : 'status');
drupal_set_message($req['description'], $type);
}
}
}
}
/**
* Implements hook_cron().
*
* Used to execute an optimization operation on all enabled Solr servers once a
* day.
*/
function search_api_solr_cron() {
if (REQUEST_TIME - variable_get('search_api_solr_last_optimize', 0) > 86400) {
variable_set('search_api_solr_last_optimize', REQUEST_TIME);
$conditions = array('class' => 'search_api_solr_service', 'enabled' => TRUE);
foreach (search_api_server_load_multiple(FALSE, $conditions) as $server) {
try {
$server->getSolrConnection()->optimize(FALSE);
}
catch (Exception $e) {
watchdog_exception('search_api_solr', $e, '%type while optimizing Solr server @server: !message in %function (line %line of %file).', array('@server' => $server->name));
}
}
}
}
/**
* Implements hook_flush_caches().
*/
function search_api_solr_flush_caches() {
return array('cache_search_api_solr');
}
/**
* Implements hook_search_api_server_update().
*/
function search_api_solr_search_api_server_update(SearchApiServer $server) {
if ($server->class === 'search_api_solr_service') {
$server->getSolrConnection()->clearCache();
}
}
/**
* Implements hook_views_api().
*/
function search_api_solr_views_api() {
if (module_exists('search_api_views')) {
return array(
'api' => 3,
);
}
}
/**
* Retrieves Solr-specific data for available data types.
*
* Returns the data type information for both the default Search API data types
* and custom data types defined by hook_search_api_data_type_info(). Names for
* default data types are not included, since they are not relevant to the Solr
* service class.
*
* We're adding some extra Solr field information for the default search api
* data types (as well as on behalf of a couple contrib field types). The
* extra information we're adding is documented in
* search_api_solr_hook_search_api_data_type_info(). You can use the same
* additional keys in hook_search_api_data_type_info() to support custom
* dynamic fields in your indexes with Solr.
*
* @param string|null $type
* (optional) A specific type for which the information should be returned.
* Defaults to returning all information.
*
* @return array|null
* If $type was given, information about that type or NULL if it is unknown.
* Otherwise, an array of all types. The format in both cases is the same as
* for search_api_get_data_type_info().
*
* @see search_api_get_data_type_info()
* @see search_api_solr_hook_search_api_data_type_info()
*/
function search_api_solr_get_data_type_info($type = NULL) {
$types = &drupal_static(__FUNCTION__);
if (!isset($types)) {
// Grab the stock search_api data types.
$types = search_api_get_data_type_info();
// Add our extras for the default search api fields.
$types += array(
'text' => array(
'prefix' => 'tm',
'always multiValued' => TRUE,
),
'string' => array(
'prefix' => 's',
),
'integer' => array(
'prefix' => 'i',
),
'decimal' => array(
'prefix' => 'f',
),
'date' => array(
'prefix' => 'd',
),
'duration' => array(
'prefix' => 'i',
),
'boolean' => array(
'prefix' => 'b',
),
'uri' => array(
'prefix' => 's',
),
'tokens' => array(
'prefix' => 'tm',
'always multiValued' => TRUE,
),
);
// Extra data type info.
$extra_types_info = array(
'location' => array(
'prefix' => 'loc',
),
'geohash' => array(
'prefix' => 'geo',
),
);
// For the extra types, only add our extra info if it's already been defined.
foreach ($extra_types_info as $key => $info) {
if (array_key_exists($key, $types)) {
// Merge our extras into the data type info
$types[$key] += $info;
}
}
}
// Return the info.
if (isset($type)) {
return isset($types[$type]) ? $types[$type] : NULL;
}
return $types;
}
/**
* Retrieves a list of all config files of a server.
*
* @param SearchApiServer $server
* The Solr server whose files should be retrieved.
* @param string $dir_name
* (optional) The directory that should be searched for files. Defaults to the
* root config directory.
*
* @return array
* An associative array of all config files in the given directory. The keys
* are the file names, values are arrays with information about the file. The
* files are returned in alphabetical order and breadth-first.
*
* @throws SearchApiException
* If a problem occurred while retrieving the files.
*/
function search_api_solr_server_get_files(SearchApiServer $server, $dir_name = NULL) {
$response = $server->getFile($dir_name);
// Search for directories and recursively merge directory files.
$files_data = json_decode($response->data, TRUE);
$files_list = $files_data['files'];
$result = array('' => array());
foreach ($files_list as $file_name => $file_info) {
if (empty($file_info['directory'])) {
$result[''][$file_name] = $file_info;
}
else {
$result[$file_name] = search_api_solr_server_get_files($server, $file_name);
}
}
ksort($result);
ksort($result['']);
return array_reduce($result, 'array_merge', array());
}
/**
* @deprecated
*
* @see search_api_solr_access_server_files()
*/
function search_api_access_server_files(SearchApiServer $server) {
return search_api_solr_access_server_files($server);
}
/**
* Access callback for a server's "Files" tab.
*
* Grants access if the user has the "administer search_api" permission and the
* server is a Solr server.
*
* @param SearchApiServer $server
* The server for which access should be tested.
*
* @return bool
* TRUE if access should be granted, FALSE otherwise.
*/
function search_api_solr_access_server_files(SearchApiServer $server) {
if (!user_access('administer search_api')) {
return FALSE;
}
$service_info = search_api_get_service_info($server->class);
$service_class = $service_info['class'];
if (empty($service_class) || !class_exists($service_class)) {
// Service class not found.
return FALSE;
}
if ($service_class == 'SearchApiSolrService' || in_array('SearchApiSolrService', class_parents($service_class))) {
// It's a SearchApiSolrService-based service class.
return TRUE;
}
return FALSE;
}
/**
* Switches a server to use clean identifiers.
*
* Used as a submit callback in SearchApiSolrService::configurationForm().
*/
function _search_api_solr_switch_to_clean_ids(array $form, array &$form_state) {
$server = $form_state['server'];
$server->options['clean_ids'] = TRUE;
$server->save();
drupal_set_message(t('The Solr server was successfully switched to use clean field identifiers.'));
$count = 0;
$conditions['server'] = $server->machine_name;
$conditions['enabled'] = 1;
foreach (search_api_index_load_multiple(FALSE, $conditions) as $index) {
if (!empty($index->options['fields'])) {
foreach ($index->options['fields'] as $key => $field) {
if (strpos($key, ':') !== FALSE) {
$index->reindex();
++$count;
break;
}
}
}
}
if ($count) {
$msg = format_plural($count, '1 index was scheduled for re-indexing.', '@count indexes were scheduled for re-indexing.');
drupal_set_message($msg);
}
}

View File

@ -0,0 +1,62 @@
<?php
/**
* @file
* Views integration code for the Search API Solr module.
*/
/**
* Implements hook_views_data_alter().
*
* Adds field handlers for "virtual" fields, if the index's Solr server has
* "Retrieve results data from Solr" enabled.
*/
function search_api_solr_views_data_alter(array &$data) {
try {
foreach (search_api_index_load_multiple(FALSE) as $index) {
$server = $index->server();
if (!$server || empty($server->options['retrieve_data'])) {
// Skip only this index; other indexes may still qualify.
continue;
}
// Fill in base data.
$key = 'search_api_index_' . $index->machine_name;
$table = &$data[$key];
try {
$wrapper = $index->entityWrapper(NULL, FALSE);
}
catch (EntityMetadataWrapperException $e) {
watchdog_exception('search_api_solr', $e, "%type while retrieving metadata for index %index: !message in %function (line %line of %file).", array('%index' => $index->name), WATCHDOG_WARNING);
continue;
}
// Remember fields that aren't added by data alterations, etc. (since
// there isn't any other way to tell them apart).
$normal_fields = array();
foreach ($wrapper as $key => $property) {
$normal_fields[$key] = TRUE;
}
try {
$wrapper = $index->entityWrapper(NULL);
}
catch (EntityMetadataWrapperException $e) {
watchdog_exception('search_api_solr', $e, "%type while retrieving metadata for index %index: !message in %function (line %line of %file).", array('%index' => $index->name), WATCHDOG_WARNING);
continue;
}
// Add field handlers for items added by data alterations, etc.
foreach ($wrapper as $key => $property) {
if (empty($normal_fields[$key])) {
$info = $property->info();
if ($info) {
entity_views_field_definition($key, $info, $table);
}
}
}
}
}
catch (Exception $e) {
watchdog_exception('search_api_solr', $e);
}
}

View File

@ -0,0 +1,31 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
This file allows you to boost certain search items to the top of search
results. You can find out an item's ID by searching directly on the Solr
server. The item IDs are in general constructed as follows:
Search API:
$document->id = $index_id . '-' . $item_id;
Apache Solr Search Integration:
$document->id = $site_hash . '/' . $entity_type . '/' . $entity->id;
If you want this file to be automatically re-loaded when a Solr commit takes
place (e.g., if you have an automatic script active which updates elevate.xml
according to newly-indexed data), place it into Solr's data/ directory.
Otherwise, place it with the other configuration files into the conf/
directory.
See http://wiki.apache.org/solr/QueryElevationComponent for more information.
-->
<elevate>
<!-- Example for ranking the node #1 first in searches for "example query": -->
<!--
<query text="example query">
<doc id="default_node_index-1" />
<doc id="7v3jsc/node/1" />
</query>
-->
<!-- Multiple <query> elements can be specified, contained in one <elevate>. -->
<!-- <query text="...">...</query> -->
</elevate>

View File

@ -0,0 +1,14 @@
# This file contains character mappings for the default fulltext field type.
# The source characters (on the left) will be replaced by the respective target
# characters before any other processing takes place.
# Lines starting with a pound character # are ignored.
#
# For sensible defaults, use the mapping-ISOLatin1Accent.txt file distributed
# with the example application of your Solr version.
#
# Examples:
# "À" => "A"
# "\u00c4" => "A"
# "\u00c4" => "\u0041"
# "æ" => "ae"
# "\n" => " "

View File

@ -0,0 +1,7 @@
#-----------------------------------------------------------------------
# This file blocks words from being operated on by the stemmer and word delimiter.
&amp;
&lt;
&gt;
&#039;
&quot;

View File

@ -0,0 +1,535 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
This is the Solr schema file. This file should be named "schema.xml" and
should be in the conf directory under the solr home
(i.e. ./solr/conf/schema.xml by default)
or located where the classloader for the Solr webapp can find it.
For more information on how to customize this file, please see
http://wiki.apache.org/solr/SchemaXml
-->
<schema name="drupal-4.1-solr-1.4" version="1.2">
<!-- attribute "name" is the name of this schema and is only used for display purposes.
Applications should change this to reflect the nature of the search collection.
version="1.2" is Solr's version number for the schema syntax and semantics. It should
not normally be changed by applications.
1.0: multiValued attribute did not exist, all fields are multiValued by nature
1.1: multiValued attribute introduced, false by default
1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields.
-->
<types>
<!-- field type definitions. The "name" attribute is
just a label to be used by field definitions. The "class"
attribute and any other attributes determine the real
behavior of the fieldType.
Class names starting with "solr" refer to java classes in the
org.apache.solr.analysis package.
-->
<!-- The StrField type is not analyzed, but indexed/stored verbatim.
- StrField and TextField support an optional compressThreshold which
limits compression (if enabled in the derived fields) to values which
exceed a certain size (in characters).
-->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
<!-- Binary data type. The data should be sent/retrieved as Base64-encoded strings. -->
<fieldtype name="binary" class="solr.BinaryField"/>
<!-- The optional sortMissingLast and sortMissingFirst attributes are
currently supported on types that are sorted internally as strings.
- If sortMissingLast="true", then a sort on this field will cause documents
without the field to come after documents with the field,
regardless of the requested sort order (asc or desc).
- If sortMissingFirst="true", then a sort on this field will cause documents
without the field to come before documents with the field,
regardless of the requested sort order.
- If sortMissingLast="false" and sortMissingFirst="false" (the default),
then default lucene sorting will be used which places docs without the
field first in an ascending sort and last in a descending sort.
-->
<!-- numeric field types that can be sorted, but are not optimized for range queries -->
<fieldType name="integer" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<!--
Note:
These should only be used for compatibility with existing indexes (created with older Solr versions)
or if "sortMissingFirst" or "sortMissingLast" functionality is needed. Use Trie based fields instead.
Numeric field types that manipulate the value into
a string value that isn't human-readable in its internal form,
but with a lexicographic ordering the same as the numeric ordering,
so that range queries work correctly.
-->
<fieldType name="sint" class="solr.TrieIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.TrieFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.TrieLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.TrieDoubleField" sortMissingLast="true" omitNorms="true"/>
<!--
Numeric field types that index each value at various levels of precision
to accelerate range queries when the number of values between the range
endpoints is large. See the javadoc for NumericRangeQuery for internal
implementation details.
Smaller precisionStep values (specified in bits) will lead to more tokens
indexed per value, slightly larger index size, and faster range queries.
A precisionStep of 0 disables indexing at different precision levels.
-->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<!--
The ExternalFileField type gets values from an external file instead of the
index. This is useful for data such as rankings that might change frequently
and require different update frequencies than the documents they are
associated with.
-->
<fieldType name="pfloat" class="solr.FloatField" omitNorms="true"/>
<fieldType name="file" keyField="id" defVal="1" stored="false" indexed="false" class="solr.ExternalFileField" valType="pfloat"/>
<!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and
is a more restricted form of the canonical representation of dateTime
http://www.w3.org/TR/xmlschema-2/#dateTime
The trailing "Z" designates UTC time and is mandatory.
Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
All other components are mandatory.
Expressions can also be used to denote calculations that should be
performed relative to "NOW" to determine the value, ie...
NOW/HOUR
... Round to the start of the current hour
NOW-1DAY
... Exactly 1 day prior to now
NOW/DAY+6MONTHS+3DAYS
... 6 months and 3 days in the future from the start of
the current day
Consult the DateField javadocs for more information.
-->
<fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>
<!-- A Trie based date field for faster date range queries and date faceting. -->
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
<!-- solr.TextField allows the specification of custom text analyzers
specified as a tokenizer and a list of token filters. Different
analyzers may be specified for indexing and querying.
The optional positionIncrementGap puts space between multiple fields of
this type on the same document, with the purpose of preventing false phrase
matching across fields.
For more info on customizing your analyzer chain, please see
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
-->
<!-- One can also specify an existing Analyzer class that has a
default constructor via the class attribute on the analyzer element
<fieldType name="text_greek" class="solr.TextField">
<analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
</fieldType>
-->
<!-- A text field that only splits on whitespace for exact matching of words -->
<fieldType name="text_ws" class="solr.TextField" omitNorms="true" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- A text field that uses WordDelimiterFilter to enable splitting and matching of
words on case-change, alpha numeric boundaries, and non-alphanumeric chars,
so that a query of "wifi" or "wi fi" could match a document containing "Wi-Fi".
Synonyms and stopwords are customized by external files, and stemming is enabled.
Duplicate tokens at the same position (which may result from Stemmed Synonyms or
WordDelim parts) are removed.
-->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- Case insensitive stop word removal.
add enablePositionIncrements=true in both the index and query
analyzers to leave a 'gap' for more accurate phrase queries.
-->
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="1"
catenateNumbers="1"
catenateAll="0"
splitOnCaseChange="1"
preserveOriginal="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
splitOnCaseChange="1"
preserveOriginal="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!-- An unstemmed text field - good if one does not know the language of the field -->
<fieldType name="text_und" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="1"
catenateNumbers="1"
catenateAll="0"
splitOnCaseChange="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
splitOnCaseChange="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- Edge N gram type - for example for matching against queries with results
KeywordTokenizer leaves input string intact as a single term.
see: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
-->
<fieldType name="edge_n2_kw_text" class="solr.TextField" omitNorms="true" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- Setup simple analysis for spell checking -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.LengthFilterFactory" min="4" max="20" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
</analyzer>
</fieldType>
<!-- This is an example of using the KeywordTokenizer along
with various TokenFilterFactories to produce a sortable field
that does not include some properties of the source text
-->
<fieldType name="sortString" class="solr.TextField" sortMissingLast="true" omitNorms="true">
<analyzer>
<!-- KeywordTokenizer does no actual tokenizing, so the entire
input string is preserved as a single token
-->
<tokenizer class="solr.KeywordTokenizerFactory"/>
<!-- The LowerCase TokenFilter does what you expect, which can be useful
when you want your sorting to be case insensitive
-->
<filter class="solr.LowerCaseFilterFactory" />
<!-- The TrimFilter removes any leading or trailing whitespace -->
<filter class="solr.TrimFilterFactory" />
<!-- The PatternReplaceFilter gives you the flexibility to use
Java Regular expression to replace any sequence of characters
matching a pattern with an arbitrary replacement string,
which may include back references to portions of the original
string matched by the pattern.
See the Java Regular Expression documentation for more
information on pattern and replacement string syntax.
http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html
<filter class="solr.PatternReplaceFilterFactory"
pattern="(^\p{Punct}+)" replacement="" replace="all"
/>
-->
</analyzer>
</fieldType>
<!-- A random sort type -->
<fieldType name="rand" class="solr.RandomSortField" indexed="true" />
<!-- since fields of this type are by default not stored or indexed, any data added to
them will be ignored outright
-->
<fieldtype name="ignored" stored="false" indexed="false" class="solr.StrField" />
</types>
<!-- Following is a dynamic way to include other types, added by other contrib modules -->
<xi:include href="solr/conf/schema_extra_types.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:fallback></xi:fallback>
</xi:include>
<fields>
<!-- Valid attributes for fields:
name: mandatory - the name for the field
type: mandatory - the name of a previously defined type from the <types> section
indexed: true if this field should be indexed (searchable or sortable)
stored: true if this field should be retrievable
compressed: [false] if this field should be stored using gzip compression
(this will only apply if the field type is compressible; among
the standard field types, only TextField and StrField are)
multiValued: true if this field may contain multiple values per document
omitNorms: (expert) set to true to omit the norms associated with
this field (this disables length normalization and index-time
boosting for the field, and saves some memory). Only full-text
fields or fields that need an index-time boost need norms.
-->
<!-- The document id is usually derived from a site-specific key (hash) and the
entity type and ID, like:
Search API:
The format used is $document->id = $index_id . '-' . $item_id
Apache Solr Search Integration:
The format used is $document->id = $site_hash . '/' . $entity_type . '/' . $entity->id;
-->
<field name="id" type="string" indexed="true" stored="true" required="true" />
<!-- Search Api specific fields -->
<!-- item_id contains the entity ID, e.g. a node's nid. -->
<field name="item_id" type="string" indexed="true" stored="true" />
<!-- index_id is the machine name of the search index this entry belongs to. -->
<field name="index_id" type="string" indexed="true" stored="true" />
<!-- Since sorting by ID is explicitly allowed, store item_id also in a sortable way. -->
<copyField source="item_id" dest="sort_search_api_id" />
<!-- Apache Solr Search Integration specific fields -->
<!-- entity_id is the numeric object ID, e.g. Node ID, File ID -->
<field name="entity_id" type="long" indexed="true" stored="true" />
<!-- entity_type is 'node', 'file', 'user', or some other Drupal object type -->
<field name="entity_type" type="string" indexed="true" stored="true" />
<!-- bundle is a node type, or as appropriate for other entity types -->
<field name="bundle" type="string" indexed="true" stored="true"/>
<field name="bundle_name" type="string" indexed="true" stored="true"/>
<field name="site" type="string" indexed="true" stored="true"/>
<field name="hash" type="string" indexed="true" stored="true"/>
<field name="url" type="string" indexed="true" stored="true"/>
<!-- label is the default field for a human-readable string for this entity (e.g. the title of a node) -->
<field name="label" type="text" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<!-- The string version of the title is used for sorting -->
<copyField source="label" dest="sort_label"/>
<!-- content is the default field for full text search - dump crap here -->
<field name="content" type="text" indexed="true" stored="true" termVectors="true"/>
<field name="teaser" type="text" indexed="false" stored="true"/>
<field name="path" type="string" indexed="true" stored="true"/>
<field name="path_alias" type="text" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<!-- These are the fields that correspond to a Drupal node. The beauty of having
Lucene store title, body, type, etc., is that we retrieve them with the search
result set and don't need to go to the database with a node_load. -->
<field name="tid" type="long" indexed="true" stored="true" multiValued="true"/>
<field name="taxonomy_names" type="text" indexed="true" stored="false" termVectors="true" multiValued="true" omitNorms="true"/>
<!-- Copy terms to a single field that contains all taxonomy term names -->
<copyField source="tm_vid_*" dest="taxonomy_names"/>
<!-- Here, default is used to create a "timestamp" field indicating
when each document was indexed.-->
<field name="timestamp" type="tdate" indexed="true" stored="true" default="NOW" multiValued="false"/>
<!-- This field is used to build the spellchecker index -->
<field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field differently,
or to add multiple fields to the same field for easier/faster searching. -->
<copyField source="label" dest="spell"/>
<copyField source="content" dest="spell"/>
<copyField source="ts_*" dest="spell"/>
<copyField source="tm_*" dest="spell"/>
<!-- Dynamic field definitions. If a field name is not found, dynamicFields
will be used if the name matches any of the patterns.
RESTRICTION: the glob-like pattern in the name attribute must have
a "*" only at the start or the end.
EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, z_i)
Longer patterns will be matched first. If patterns of equal length
both match, the first one appearing in the schema will be used. -->
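<!-- Illustrative sketch only (the field names below are hypothetical and are
not defined anywhere in this schema). Given the prefix patterns declared
in this section, a document such as
<doc>
<field name="id">default_node_index-1</field>
<field name="ss_language">en</field>
<field name="im_field_tags">12</field>
<field name="im_field_tags">34</field>
<field name="tm_body">Some full text</field>
</doc>
would resolve ss_language through "ss_*" (string, single-valued),
im_field_tags through "im_*" (long, multi-valued) and tm_body through
"tm_*" (text, multi-valued). -->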
<!-- A set of fields to contain text extracted from HTML tag contents which we
can boost at query time. -->
<dynamicField name="tags_*" type="text" indexed="true" stored="false" omitNorms="true"/>
<!-- For 2 and 3 letter prefix dynamic fields, the 1st letter indicates the data type and
the last letter is 's' for single valued, 'm' for multi-valued -->
<!-- We use long for integer since 64 bit ints are now common in PHP. -->
<dynamicField name="is_*" type="long" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="im_*" type="long" indexed="true" stored="true" multiValued="true"/>
<!-- List of floats can be saved in a regular float field -->
<dynamicField name="fs_*" type="float" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="fm_*" type="float" indexed="true" stored="true" multiValued="true"/>
<!-- List of doubles can be saved in a regular double field -->
<dynamicField name="ps_*" type="double" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="pm_*" type="double" indexed="true" stored="true" multiValued="true"/>
<!-- List of booleans can be saved in a regular boolean field -->
<dynamicField name="bm_*" type="boolean" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="bs_*" type="boolean" indexed="true" stored="true" multiValued="false"/>
<!-- Regular text (without processing) can be stored in a string field-->
<dynamicField name="ss_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="sm_*" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Normal text fields are for full text - the relevance of a match depends on the length of the text -->
<dynamicField name="ts_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tm_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<!-- Unstemmed text fields for full text - the relevance of a match depends on the length of the text -->
<dynamicField name="tus_*" type="text_und" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tum_*" type="text_und" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<!-- These text fields omit norms - useful for extracted text like taxonomy_names -->
<dynamicField name="tos_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true" omitNorms="true"/>
<dynamicField name="tom_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true" omitNorms="true"/>
<!-- Special-purpose text fields -->
<dynamicField name="tes_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="false" omitTermFreqAndPositions="true" />
<dynamicField name="tem_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="true" omitTermFreqAndPositions="true" />
<dynamicField name="tws_*" type="text_ws" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="twm_*" type="text_ws" indexed="true" stored="true" multiValued="true"/>
<!-- trie dates are preferred, so give them the 2 letter prefix -->
<dynamicField name="ds_*" type="tdate" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="dm_*" type="tdate" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="its_*" type="tlong" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="itm_*" type="tlong" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="fts_*" type="tfloat" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ftm_*" type="tfloat" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="pts_*" type="tdouble" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ptm_*" type="tdouble" indexed="true" stored="true" multiValued="true"/>
<!-- Binary fields can be populated using base64 encoded data. Useful e.g. for embedding
a small image in a search result using the data URI scheme -->
<dynamicField name="xs_*" type="binary" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="xm_*" type="binary" indexed="false" stored="true" multiValued="true"/>
<!-- In rare cases a date rather than tdate is needed for sortMissingLast -->
<dynamicField name="dds_*" type="date" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ddm_*" type="date" indexed="true" stored="true" multiValued="true"/>
<!-- Sortable fields, good for sortMissingLast support.
We use long for integers since 64 bit ints are now common in PHP. -->
<dynamicField name="iss_*" type="slong" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ism_*" type="slong" indexed="true" stored="true" multiValued="true"/>
<!-- In rare cases a sfloat rather than tfloat is needed for sortMissingLast -->
<dynamicField name="fss_*" type="sfloat" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="fsm_*" type="sfloat" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="pss_*" type="sdouble" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="psm_*" type="sdouble" indexed="true" stored="true" multiValued="true"/>
<!-- In case a 32 bit int is really needed, we provide these fields. 'h' is mnemonic for 'half word', i.e. 32 bit on 64 arch -->
<dynamicField name="hs_*" type="integer" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="hm_*" type="integer" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="hss_*" type="sint" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="hsm_*" type="sint" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="hts_*" type="tint" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="htm_*" type="tint" indexed="true" stored="true" multiValued="true"/>
<!-- Unindexed string fields that can be used to store values that won't be searchable -->
<dynamicField name="zs_*" type="string" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="zm_*" type="string" indexed="false" stored="true" multiValued="true"/>
<!-- Begin compatibility code for added fields in Solr 3.4+
http://wiki.apache.org/solr/SpatialSearch#geodist_-_The_distance_function -->
<dynamicField name="points_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="pointm_*" type="string" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="locs_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="locm_*" type="string" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="geos_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="geom_*" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- External file fields -->
<dynamicField name="eff_*" type="string"/>
<!-- End compatibility code -->
<!-- Sortable version of the dynamic string field -->
<dynamicField name="sort_*" type="sortString" indexed="true" stored="false"/>
<copyField source="ss_*" dest="sort_*"/>
<!-- A random sort field -->
<dynamicField name="random_*" type="rand" indexed="true" stored="true"/>
<!-- This field is used to store access information (e.g. node access grants), as opposed to field data -->
<dynamicField name="access_*" type="integer" indexed="true" stored="false" multiValued="true"/>
<!-- The following causes solr to ignore any fields that don't already match an existing
field name or dynamic field, rather than reporting them as an error.
Alternately, change the type="ignored" to some other type e.g. "text" if you want
unknown fields indexed and/or stored by default -->
<dynamicField name="*" type="ignored" multiValued="true" />
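<!-- Illustrative sketch of the alternative described above (shown commented
out; using the "text" type here is only one possible choice). It would
replace the catch-all definition above rather than be added next to it:
<dynamicField name="*" type="text" indexed="true" stored="true" multiValued="true" />
-->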
</fields>
<!-- Following is a dynamic way to include other fields, added by other contrib modules -->
<xi:include href="solr/conf/schema_extra_fields.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:fallback></xi:fallback>
</xi:include>
<!-- Field to use to determine and enforce document uniqueness.
Unless this field is marked with required="false", it will be a required field
-->
<uniqueKey>id</uniqueKey>
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>content</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="AND"/>
</schema>

View File

@ -0,0 +1,23 @@
<fields>
<!--
Adding German dynamic field types to our Solr Schema
If you enable this, make sure you have a folder called lang with stopwords_de.txt
and synonyms_de.txt in there
This also requires enabling the content in schema_extra_types.xml
-->
<!--
<field name="label_de" type="text_de" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<field name="content_de" type="text_de" indexed="true" stored="true" termVectors="true"/>
<field name="teaser_de" type="text_de" indexed="false" stored="true"/>
<field name="path_alias_de" type="text_de" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<field name="taxonomy_names_de" type="text_de" indexed="true" stored="false" termVectors="true" multiValued="true" omitNorms="true"/>
<field name="spell_de" type="text_de" indexed="true" stored="true" multiValued="true"/>
<copyField source="label_de" dest="spell_de"/>
<copyField source="content_de" dest="spell_de"/>
<dynamicField name="tags_de_*" type="text_de" indexed="true" stored="false" omitNorms="true"/>
<dynamicField name="ts_de_*" type="text_de" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tm_de_*" type="text_de" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<dynamicField name="tos_de_*" type="text_de" indexed="true" stored="true" multiValued="false" termVectors="true" omitNorms="true"/>
<dynamicField name="tom_de_*" type="text_de" indexed="true" stored="true" multiValued="true" termVectors="true" omitNorms="true"/>
-->
</fields>

View File

@ -0,0 +1,30 @@
<types>
<!--
Adding German language support to our Solr schema
If you enable this, make sure you have a folder called lang with stopwords_de.txt
and synonyms_de.txt in there
-->
<!--
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" words="lang/stopwords_de.txt" format="snowball" ignoreCase="true" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" splitOnCaseChange="1" splitOnNumerics="1" catenateWords="1" catenateNumbers="1" catenateAll="0" protected="protwords.txt" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.GermanLightStemFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms_de.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" words="lang/stopwords_de.txt" format="snowball" ignoreCase="true" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" splitOnCaseChange="1" splitOnNumerics="1" catenateWords="0" catenateNumbers="0" catenateAll="0" protected="protwords.txt" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.GermanLightStemFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
-->
</types>

File diff suppressed because it is too large

View File

@ -0,0 +1,80 @@
<!-- Spell Check
The spell check component can return a list of alternative spelling
suggestions.
http://wiki.apache.org/solr/SpellCheckComponent
-->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<!-- Multiple "Spell Checkers" can be declared and used by this
component
-->
<!-- a spellchecker built from a field of the main index, and
written to disk
-->
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">spell</str>
<str name="spellcheckIndexDir">spellchecker</str>
<str name="buildOnOptimize">true</str>
<!-- uncomment this to require terms to occur in 1% of the documents in order to be included in the dictionary
<float name="thresholdTokenFrequency">.01</float>
-->
</lst>
<!--
Adding a German spellchecker index to our Solr index
This also requires enabling the content in schema_extra_types.xml and schema_extra_fields.xml
-->
<!--
<lst name="spellchecker">
<str name="name">spellchecker_de</str>
<str name="field">spell_de</str>
<str name="spellcheckIndexDir">./spellchecker_de</str>
<str name="buildOnOptimize">true</str>
</lst>
-->
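<!-- Illustrative usage, assuming the German spellchecker above has been
enabled and that the request handler in use includes the spellcheck
component. A request could then select this dictionary by name with
the standard SpellCheckComponent parameters (the query term is arbitrary):
q=haus&spellcheck=true&spellcheck.dictionary=spellchecker_de
-->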
<!-- a spellchecker that uses a different distance measure -->
<!--
<lst name="spellchecker">
<str name="name">jarowinkler</str>
<str name="field">spell</str>
<str name="distanceMeasure">
org.apache.lucene.search.spell.JaroWinklerDistance
</str>
<str name="spellcheckIndexDir">spellcheckerJaro</str>
</lst>
-->
<!-- a spellchecker that uses an alternate comparator
comparatorClass can be one of:
1. score (default)
2. freq (Frequency first, then score)
3. A fully qualified class name
-->
<!--
<lst name="spellchecker">
<str name="name">freq</str>
<str name="field">lowerfilt</str>
<str name="spellcheckIndexDir">spellcheckerFreq</str>
<str name="comparatorClass">freq</str>
<str name="buildOnCommit">true</str>
</lst>
-->
<!-- A spellchecker that reads the list of words from a file -->
<!--
<lst name="spellchecker">
<str name="classname">solr.FileBasedSpellChecker</str>
<str name="name">file</str>
<str name="sourceLocation">spellings.txt</str>
<str name="characterEncoding">UTF-8</str>
<str name="spellcheckIndexDir">spellcheckerFile</str>
</lst>
-->
</searchComponent>

View File

@ -0,0 +1,10 @@
# Defines Solr properties for this specific core.
solr.replication.master=false
solr.replication.slave=false
solr.replication.pollInterval=00:00:60
solr.replication.masterUrl=http://localhost:8983/solr
solr.replication.confFiles=schema.xml,mapping-ISOLatin1Accent.txt,protwords.txt,stopwords.txt,synonyms.txt,elevate.xml
solr.mlt.timeAllowed=2000
solr.pinkPony.timeAllowed=-1
solr.autoCommit.MaxDocs=10000
solr.autoCommit.MaxTime=120000
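# Illustrative sketch (commented out; the host name is a placeholder, and this
# assumes solrconfig.xml reads these properties for its replication handler):
# a core that polls a master for index updates might override the defaults with
# solr.replication.slave=true
# solr.replication.masterUrl=http://master.example.com:8983/solr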

View File

@ -0,0 +1,4 @@
# Contains words which shouldn't be indexed for fulltext fields, e.g., because
# they're too common. For documentation of the format, see
# http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StopFilterFactory
# (Lines starting with a pound character # are ignored.)

View File

@ -0,0 +1,3 @@
# Contains synonyms to use for your index. For the format used, see
# http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
# (Lines starting with a pound character # are ignored.)
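# Illustrative examples of the two common line formats (the word choices are
# arbitrary and shown commented out, so they have no effect):
# Equivalent terms, comma-separated:
#   tv, television, telly
# Explicit one-way mapping from the left-hand terms to the right-hand term:
#   i-pod, i pod => ipod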

View File

@ -0,0 +1,31 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
This file allows you to boost certain search items to the top of search
results. You can find out an item's ID by searching directly on the Solr
server. The item IDs are in general constructed as follows:
Search API:
$document->id = $index_id . '-' . $item_id;
Apache Solr Search Integration:
$document->id = $site_hash . '/' . $entity_type . '/' . $entity->id;
If you want this file to be automatically re-loaded when a Solr commit takes
place (e.g., if you have an automatic script active which updates elevate.xml
according to newly-indexed data), place it into Solr's data/ directory.
Otherwise, place it with the other configuration files into the conf/
directory.
See http://wiki.apache.org/solr/QueryElevationComponent for more information.
-->
<elevate>
<!-- Example for ranking the node #1 first in searches for "example query": -->
<!--
<query text="example query">
<doc id="default_node_index-1" />
<doc id="7v3jsc/node/1" />
</query>
-->
<!-- Multiple <query> elements can be specified, contained in one <elevate>. -->
<!-- <query text="...">...</query> -->
</elevate>

View File

@ -0,0 +1,14 @@
# This file contains character mappings for the default fulltext field type.
# The source characters (on the left) will be replaced by the respective target
# characters before any other processing takes place.
# Lines starting with a pound character # are ignored.
#
# For sensible defaults, use the mapping-ISOLatin1Accent.txt file distributed
# with the example application of your Solr version.
#
# Examples:
# "À" => "A"
# "\u00c4" => "A"
# "\u00c4" => "\u0041"
# "æ" => "ae"
# "\n" => " "

View File

@ -0,0 +1,7 @@
#-----------------------------------------------------------------------
# This file blocks words from being operated on by the stemmer and word delimiter.
&amp;
&lt;
&gt;
&#039;
&quot;

View File

@ -0,0 +1,546 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
This is the Solr schema file. This file should be named "schema.xml" and
should be in the conf directory under the solr home
(i.e. ./solr/conf/schema.xml by default)
or located where the classloader for the Solr webapp can find it.
For more information on how to customize this file, please see
http://wiki.apache.org/solr/SchemaXml
-->
<schema name="drupal-4.2-solr-3.x" version="1.3">
<!-- attribute "name" is the name of this schema and is only used for display purposes.
Applications should change this to reflect the nature of the search collection.
The version attribute (1.3 here) is Solr's version number for the schema syntax and semantics. It should
not normally be changed by applications.
1.0: multiValued attribute did not exist, all fields are multiValued by nature
1.1: multiValued attribute introduced, false by default
1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields.
1.3: removed optional field compress feature
-->
<types>
<!-- field type definitions. The "name" attribute is
just a label to be used by field definitions. The "class"
attribute and any other attributes determine the real
behavior of the fieldType.
Class names starting with "solr" refer to java classes in the
org.apache.solr.analysis package.
-->
<!-- The StrField type is not analyzed, but indexed/stored verbatim.
- StrField and TextField support an optional compressThreshold which
limits compression (if enabled in the derived fields) to values which
exceed a certain size (in characters).
-->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
<!-- Binary data type. The data should be sent/retrieved as Base64-encoded strings. -->
<fieldtype name="binary" class="solr.BinaryField"/>
<!-- The optional sortMissingLast and sortMissingFirst attributes are
currently supported on types that are sorted internally as strings.
- If sortMissingLast="true", then a sort on this field will cause documents
without the field to come after documents with the field,
regardless of the requested sort order (asc or desc).
- If sortMissingFirst="true", then a sort on this field will cause documents
without the field to come before documents with the field,
regardless of the requested sort order.
- If sortMissingLast="false" and sortMissingFirst="false" (the default),
then default lucene sorting will be used which places docs without the
field first in an ascending sort and last in a descending sort.
-->
<!-- numeric field types that can be sorted, but are not optimized for range queries -->
<fieldType name="integer" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<!--
Note:
These should only be used for compatibility with existing indexes (created with older Solr versions)
or if "sortMissingFirst" or "sortMissingLast" functionality is needed. Use Trie based fields instead.
Numeric field types that manipulate the value into
a string value that isn't human-readable in its internal form,
but with a lexicographic ordering the same as the numeric ordering,
so that range queries work correctly.
-->
<fieldType name="sint" class="solr.TrieIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.TrieFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.TrieLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.TrieDoubleField" sortMissingLast="true" omitNorms="true"/>
<!--
Numeric field types that index each value at various levels of precision
to accelerate range queries when the number of values between the range
endpoints is large. See the javadoc for NumericRangeQuery for internal
implementation details.
Smaller precisionStep values (specified in bits) will lead to more tokens
indexed per value, slightly larger index size, and faster range queries.
A precisionStep of 0 disables indexing at different precision levels.
-->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<!--
The ExternalFileField type gets values from an external file instead of the
index. This is useful for data such as rankings that might change frequently
and require different update frequencies than the documents they are
associated with.
-->
<fieldType name="pfloat" class="solr.FloatField" omitNorms="true"/>
<fieldType name="file" keyField="id" defVal="1" stored="false" indexed="false" class="solr.ExternalFileField" valType="pfloat"/>
<!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and
is a more restricted form of the canonical representation of dateTime
http://www.w3.org/TR/xmlschema-2/#dateTime
The trailing "Z" designates UTC time and is mandatory.
Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
All other components are mandatory.
Expressions can also be used to denote calculations that should be
performed relative to "NOW" to determine the value, e.g.:
NOW/HOUR
... Round to the start of the current hour
NOW-1DAY
... Exactly 1 day prior to now
NOW/DAY+6MONTHS+3DAYS
... 6 months and 3 days in the future from the start of
the current day
Consult the DateField javadocs for more information.
-->
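<!-- Illustrative usage of date math in a query (the field name ds_created is
hypothetical; any field using a date type would do). A filter for
documents dated within the last seven days, counted from the start of
the current day, could look like:
fq=ds_created:[NOW/DAY-7DAYS TO NOW]
-->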
<fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>
<!-- A Trie based date field for faster date range queries and date faceting. -->
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
<!-- solr.TextField allows the specification of custom text analyzers
specified as a tokenizer and a list of token filters. Different
analyzers may be specified for indexing and querying.
The optional positionIncrementGap puts space between multiple fields of
this type on the same document, with the purpose of preventing false phrase
matching across fields.
For more info on customizing your analyzer chain, please see
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
-->
<!-- One can also specify an existing Analyzer class that has a
default constructor via the class attribute on the analyzer element
<fieldType name="text_greek" class="solr.TextField">
<analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
</fieldType>
-->
<!-- A text field that only splits on whitespace for exact matching of words -->
<fieldType name="text_ws" class="solr.TextField" omitNorms="true" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- A text field that uses WordDelimiterFilter to enable splitting and matching of
words on case-change, alpha numeric boundaries, and non-alphanumeric chars,
so that a query of "wifi" or "wi fi" could match a document containing "Wi-Fi".
Synonyms and stopwords are customized by external files, and stemming is enabled.
Duplicate tokens at the same position (which may result from Stemmed Synonyms or
WordDelim parts) are removed.
-->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- Case insensitive stop word removal.
add enablePositionIncrements=true in both the index and query
analyzers to leave a 'gap' for more accurate phrase queries.
-->
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="1"
catenateNumbers="1"
catenateAll="0"
splitOnCaseChange="0"
preserveOriginal="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
splitOnCaseChange="0"
preserveOriginal="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!-- An unstemmed text field - good if one does not know the language of the field -->
<fieldType name="text_und" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="1"
catenateNumbers="1"
catenateAll="0"
splitOnCaseChange="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
splitOnCaseChange="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- Edge N gram type - for example for matching against queries with results
KeywordTokenizer leaves input string intact as a single term.
see: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
-->
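<!-- Illustration (assumed input value, not part of the upstream file): with the
index analyzer below, the value "Search" is kept as a single token,
lowercased to "search", and expanded into the prefixes "se", "sea", "sear",
"searc" and "search". The query analyzer only lowercases, so a partial
query such as "Sea" becomes "sea" and matches one of those prefixes exactly.
-->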
<fieldType name="edge_n2_kw_text" class="solr.TextField" omitNorms="true" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- Setup simple analysis for spell checking -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.LengthFilterFactory" min="4" max="20" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
</analyzer>
</fieldType>
<!-- This is an example of using the KeywordTokenizer along
with various TokenFilterFactories to produce a sortable field
that does not include some properties of the source text
-->
<fieldType name="sortString" class="solr.TextField" sortMissingLast="true" omitNorms="true">
<analyzer>
<!-- KeywordTokenizer does no actual tokenizing, so the entire
input string is preserved as a single token
-->
<tokenizer class="solr.KeywordTokenizerFactory"/>
<!-- The LowerCase TokenFilter does what you expect, which can be useful
when you want your sorting to be case insensitive
-->
<filter class="solr.LowerCaseFilterFactory" />
<!-- The TrimFilter removes any leading or trailing whitespace -->
<filter class="solr.TrimFilterFactory" />
<!-- The PatternReplaceFilter gives you the flexibility to use
Java Regular expression to replace any sequence of characters
matching a pattern with an arbitrary replacement string,
which may include back references to portions of the original
string matched by the pattern.
See the Java Regular Expression documentation for more
information on pattern and replacement string syntax.
http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html
<filter class="solr.PatternReplaceFilterFactory"
pattern="(^\p{Punct}+)" replacement="" replace="all"
/>
-->
</analyzer>
</fieldType>
<!-- A random sort type -->
<fieldType name="rand" class="solr.RandomSortField" indexed="true" />
<!-- since fields of this type are by default not stored or indexed, any data added to
them will be ignored outright
-->
<fieldtype name="ignored" stored="false" indexed="false" class="solr.StrField" />
<!-- Begin added types to use features in Solr 3.4+ -->
<fieldType name="point" class="solr.PointType" dimension="2" subFieldType="tdouble"/>
<!-- A specialized field for geospatial search. If indexed, this fieldType must not be multivalued. -->
<fieldType name="location" class="solr.LatLonType" subFieldType="tdouble"/>
<!-- A Geohash is a compact representation of a latitude longitude pair in a single field.
See http://wiki.apache.org/solr/SpatialSearch
-->
<fieldtype name="geohash" class="solr.GeoHashField"/>
<!-- End added Solr 3.4+ types -->
</types>
<!-- Following is a dynamic way to include other types, added by other contrib modules -->
<xi:include href="schema_extra_types.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:fallback></xi:fallback>
</xi:include>
<fields>
<!-- Valid attributes for fields:
name: mandatory - the name for the field
type: mandatory - the name of a previously defined type from the <types> section
indexed: true if this field should be indexed (searchable or sortable)
stored: true if this field should be retrievable
compressed: [false] if this field should be stored using gzip compression
(this will only apply if the field type is compressible; among
the standard field types, only TextField and StrField are)
multiValued: true if this field may contain multiple values per document
omitNorms: (expert) set to true to omit the norms associated with
this field (this disables length normalization and index-time
boosting for the field, and saves some memory). Only full-text
fields or fields that need an index-time boost need norms.
-->
<!-- The document id is usually derived from a site-specific key (hash) and the
entity type and ID like:
Search API:
The format used is $document->id = $index_id . '-' . $item_id
Apache Solr Search Integration
The format used is $document->id = $site_hash . '/' . $entity_type . '/' . $entity->id;
-->
<field name="id" type="string" indexed="true" stored="true" required="true" />
<!-- Search API specific fields -->
<!-- item_id contains the entity ID, e.g. a node's nid. -->
<field name="item_id" type="string" indexed="true" stored="true" />
<!-- index_id is the machine name of the search index this entry belongs to. -->
<field name="index_id" type="string" indexed="true" stored="true" />
<!-- Since sorting by ID is explicitly allowed, store item_id also in a sortable way. -->
<copyField source="item_id" dest="sort_search_api_id" />
<!-- Apache Solr Search Integration specific fields -->
<!-- entity_id is the numeric object ID, e.g. Node ID, File ID -->
<field name="entity_id" type="long" indexed="true" stored="true" />
<!-- entity_type is 'node', 'file', 'user', or some other Drupal object type -->
<field name="entity_type" type="string" indexed="true" stored="true" />
<!-- bundle is a node type, or as appropriate for other entity types -->
<field name="bundle" type="string" indexed="true" stored="true"/>
<field name="bundle_name" type="string" indexed="true" stored="true"/>
<field name="site" type="string" indexed="true" stored="true"/>
<field name="hash" type="string" indexed="true" stored="true"/>
<field name="url" type="string" indexed="true" stored="true"/>
<!-- label is the default field for a human-readable string for this entity (e.g. the title of a node) -->
<field name="label" type="text" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<!-- The string version of the title is used for sorting -->
<copyField source="label" dest="sort_label"/>
<!-- content is the default field for full text search - dump crap here -->
<field name="content" type="text" indexed="true" stored="true" termVectors="true"/>
<field name="teaser" type="text" indexed="false" stored="true"/>
<field name="path" type="string" indexed="true" stored="true"/>
<field name="path_alias" type="text" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<!-- These are the fields that correspond to a Drupal node. The beauty of having
Lucene store title, body, type, etc., is that we retrieve them with the search
result set and don't need to go to the database with a node_load. -->
<field name="tid" type="long" indexed="true" stored="true" multiValued="true"/>
<field name="taxonomy_names" type="text" indexed="true" stored="false" termVectors="true" multiValued="true" omitNorms="true"/>
<!-- Copy terms to a single field that contains all taxonomy term names -->
<copyField source="tm_vid_*" dest="taxonomy_names"/>
<!-- Here, default is used to create a "timestamp" field indicating
when each document was indexed.-->
<field name="timestamp" type="tdate" indexed="true" stored="true" default="NOW" multiValued="false"/>
<!-- This field is used to build the spellchecker index -->
<field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field differently,
or to add multiple fields to the same field for easier/faster searching. -->
<copyField source="label" dest="spell"/>
<copyField source="content" dest="spell"/>
<copyField source="ts_*" dest="spell"/>
<copyField source="tm_*" dest="spell"/>
<!-- Dynamic field definitions. If a field name is not found, dynamicFields
will be used if the name matches any of the patterns.
RESTRICTION: the glob-like pattern in the name attribute must have
a "*" only at the start or the end.
EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, z_i)
Longer patterns will be matched first. If equal size patterns
both match, the first appearing in the schema will be used. -->
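<!-- Illustration (hypothetical field names, resolved against the patterns
defined below): "ss_author" matches ss_* (string), "tus_title" matches
tus_* (text_und), "sort_label" matches sort_* (sortString), and a name
such as "foo" that matches no other pattern falls through to the
catch-all "*" pattern (ignored). -->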
<!-- A set of fields to contain text extracted from HTML tag contents which we
can boost at query time. -->
<dynamicField name="tags_*" type="text" indexed="true" stored="false" omitNorms="true"/>
<!-- For 2 and 3 letter prefix dynamic fields, the 1st letter indicates the data type and
the last letter is 's' for single valued, 'm' for multi-valued -->
<!-- We use long for integer since 64 bit ints are now common in PHP. -->
<dynamicField name="is_*" type="long" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="im_*" type="long" indexed="true" stored="true" multiValued="true"/>
<!-- List of floats can be saved in a regular float field -->
<dynamicField name="fs_*" type="float" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="fm_*" type="float" indexed="true" stored="true" multiValued="true"/>
<!-- List of doubles can be saved in a regular double field -->
<dynamicField name="ps_*" type="double" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="pm_*" type="double" indexed="true" stored="true" multiValued="true"/>
<!-- List of booleans can be saved in a regular boolean field -->
<dynamicField name="bm_*" type="boolean" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="bs_*" type="boolean" indexed="true" stored="true" multiValued="false"/>
<!-- Regular text (without processing) can be stored in a string field-->
<dynamicField name="ss_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="sm_*" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Normal text fields are for full text - the relevance of a match depends on the length of the text -->
<dynamicField name="ts_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tm_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<!-- Unstemmed text fields for full text - the relevance of a match depends on the length of the text -->
<dynamicField name="tus_*" type="text_und" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tum_*" type="text_und" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<!-- These text fields omit norms - useful for extracted text like taxonomy_names -->
<dynamicField name="tos_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true" omitNorms="true"/>
<dynamicField name="tom_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true" omitNorms="true"/>
<!-- Special-purpose text fields -->
<dynamicField name="tes_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="false" omitTermFreqAndPositions="true" />
<dynamicField name="tem_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="true" omitTermFreqAndPositions="true" />
<dynamicField name="tws_*" type="text_ws" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="twm_*" type="text_ws" indexed="true" stored="true" multiValued="true"/>
<!-- trie dates are preferred, so give them the 2 letter prefix -->
<dynamicField name="ds_*" type="tdate" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="dm_*" type="tdate" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="its_*" type="tlong" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="itm_*" type="tlong" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="fts_*" type="tfloat" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ftm_*" type="tfloat" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="pts_*" type="tdouble" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ptm_*" type="tdouble" indexed="true" stored="true" multiValued="true"/>
<!-- Binary fields can be populated using base64 encoded data. Useful e.g. for embedding
a small image in a search result using the data URI scheme -->
<dynamicField name="xs_*" type="binary" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="xm_*" type="binary" indexed="false" stored="true" multiValued="true"/>
<!-- In rare cases a date rather than tdate is needed for sortMissingLast -->
<dynamicField name="dds_*" type="date" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ddm_*" type="date" indexed="true" stored="true" multiValued="true"/>
<!-- Sortable fields, good for sortMissingLast support.
We use long for integer since 64 bit ints are now common in PHP. -->
<dynamicField name="iss_*" type="slong" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ism_*" type="slong" indexed="true" stored="true" multiValued="true"/>
<!-- In rare cases a sfloat rather than tfloat is needed for sortMissingLast -->
<dynamicField name="fss_*" type="sfloat" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="fsm_*" type="sfloat" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="pss_*" type="sdouble" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="psm_*" type="sdouble" indexed="true" stored="true" multiValued="true"/>
<!-- In case a 32 bit int is really needed, we provide these fields. 'h' is mnemonic for 'half word', i.e. 32 bit on 64 arch -->
<dynamicField name="hs_*" type="integer" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="hm_*" type="integer" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="hss_*" type="sint" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="hsm_*" type="sint" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="hts_*" type="tint" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="htm_*" type="tint" indexed="true" stored="true" multiValued="true"/>
<!-- Unindexed string fields that can be used to store values that won't be searchable -->
<dynamicField name="zs_*" type="string" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="zm_*" type="string" indexed="false" stored="true" multiValued="true"/>
<!-- Begin added fields to use features in Solr 3.4+
http://wiki.apache.org/solr/SpatialSearch#geodist_-_The_distance_function -->
<dynamicField name="points_*" type="point" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="pointm_*" type="point" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="locs_*" type="location" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="locm_*" type="location" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="geos_*" type="geohash" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="geom_*" type="geohash" indexed="true" stored="true" multiValued="true"/>
<!-- External file fields -->
<dynamicField name="eff_*" type="file"/>
<!-- End added fields for Solr 3.4+ -->
<!-- Sortable version of the dynamic string field -->
<dynamicField name="sort_*" type="sortString" indexed="true" stored="false"/>
<copyField source="ss_*" dest="sort_*"/>
<!-- A random sort field -->
<dynamicField name="random_*" type="rand" indexed="true" stored="true"/>
<!-- This field is used to store access information (e.g. node access grants), as opposed to field data -->
<dynamicField name="access_*" type="integer" indexed="true" stored="false" multiValued="true"/>
<!-- The following causes solr to ignore any fields that don't already match an existing
field name or dynamic field, rather than reporting them as an error.
Alternately, change the type="ignored" to some other type e.g. "text" if you want
unknown fields indexed and/or stored by default -->
<dynamicField name="*" type="ignored" multiValued="true" />
</fields>
<!-- Following is a dynamic way to include other fields, added by other contrib modules -->
<xi:include href="schema_extra_fields.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:fallback></xi:fallback>
</xi:include>
<!-- Field to use to determine and enforce document uniqueness.
Unless this field is marked with required="false", it will be a required field
-->
<uniqueKey>id</uniqueKey>
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>content</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="AND"/>
</schema>

View File

@ -0,0 +1,23 @@
<fields>
<!--
Adding German dynamic field types to our Solr Schema
If you enable this, make sure you have a folder called lang containing
stopwords_de.txt and synonyms_de.txt.
This also requires enabling the content in schema_extra_types.xml
-->
<!--
<field name="label_de" type="text_de" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<field name="content_de" type="text_de" indexed="true" stored="true" termVectors="true"/>
<field name="teaser_de" type="text_de" indexed="false" stored="true"/>
<field name="path_alias_de" type="text_de" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<field name="taxonomy_names_de" type="text_de" indexed="true" stored="false" termVectors="true" multiValued="true" omitNorms="true"/>
<field name="spell_de" type="text_de" indexed="true" stored="true" multiValued="true"/>
<copyField source="label_de" dest="spell_de"/>
<copyField source="content_de" dest="spell_de"/>
<dynamicField name="tags_de_*" type="text_de" indexed="true" stored="false" omitNorms="true"/>
<dynamicField name="ts_de_*" type="text_de" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tm_de_*" type="text_de" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<dynamicField name="tos_de_*" type="text_de" indexed="true" stored="true" multiValued="false" termVectors="true" omitNorms="true"/>
<dynamicField name="tom_de_*" type="text_de" indexed="true" stored="true" multiValued="true" termVectors="true" omitNorms="true"/>
-->
</fields>

View File

@ -0,0 +1,30 @@
<types>
<!--
Adding German language to our Solr Schema
If you enable this, make sure you have a folder called lang containing
stopwords_de.txt and synonyms_de.txt.
-->
<!--
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" words="lang/stopwords_de.txt" format="snowball" ignoreCase="true" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" splitOnCaseChange="1" splitOnNumerics="1" catenateWords="1" catenateNumbers="1" catenateAll="0" protected="protwords.txt" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.GermanLightStemFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms_de.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" words="lang/stopwords_de.txt" format="snowball" ignoreCase="true" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" splitOnCaseChange="1" splitOnNumerics="1" catenateWords="0" catenateNumbers="0" catenateAll="0" protected="protwords.txt" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.GermanLightStemFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
-->
</types>

File diff suppressed because it is too large

View File

@ -0,0 +1,80 @@
<!-- Spell Check
The spell check component can return a list of alternative spelling
suggestions.
http://wiki.apache.org/solr/SpellCheckComponent
-->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<!-- Multiple "Spell Checkers" can be declared and used by this
component
-->
<!-- a spellchecker built from a field of the main index, and
written to disk
-->
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">spell</str>
<str name="spellcheckIndexDir">spellchecker</str>
<str name="buildOnOptimize">true</str>
<!-- uncomment this to require terms to occur in 1% of the documents in order to be included in the dictionary
<float name="thresholdTokenFrequency">.01</float>
-->
</lst>
<!--
Adding German spellchecker index to our Solr index
This also requires enabling the content in schema_extra_types.xml and schema_extra_fields.xml
-->
<!--
<lst name="spellchecker">
<str name="name">spellchecker_de</str>
<str name="field">spell_de</str>
<str name="spellcheckIndexDir">./spellchecker_de</str>
<str name="buildOnOptimize">true</str>
</lst>
-->
<!-- a spellchecker that uses a different distance measure -->
<!--
<lst name="spellchecker">
<str name="name">jarowinkler</str>
<str name="field">spell</str>
<str name="distanceMeasure">
org.apache.lucene.search.spell.JaroWinklerDistance
</str>
<str name="spellcheckIndexDir">spellcheckerJaro</str>
</lst>
-->
<!-- a spellchecker that uses an alternate comparator
comparatorClass can be one of:
1. score (default)
2. freq (Frequency first, then score)
3. A fully qualified class name
-->
<!--
<lst name="spellchecker">
<str name="name">freq</str>
<str name="field">lowerfilt</str>
<str name="spellcheckIndexDir">spellcheckerFreq</str>
<str name="comparatorClass">freq</str>
<str name="buildOnCommit">true</str>
</lst>
-->
<!-- A spellchecker that reads the list of words from a file -->
<!--
<lst name="spellchecker">
<str name="classname">solr.FileBasedSpellChecker</str>
<str name="name">file</str>
<str name="sourceLocation">spellings.txt</str>
<str name="characterEncoding">UTF-8</str>
<str name="spellcheckIndexDir">spellcheckerFile</str>
</lst>
-->
</searchComponent>

View File

@ -0,0 +1,16 @@
# Defines Solr properties for this specific core.
solr.replication.master=false
solr.replication.slave=false
solr.replication.pollInterval=00:00:60
solr.replication.masterUrl=http://localhost:8983/solr
solr.replication.confFiles=schema.xml,mapping-ISOLatin1Accent.txt,protwords.txt,stopwords.txt,synonyms.txt,elevate.xml
solr.mlt.timeAllowed=2000
# You should not set your luceneMatchVersion to anything lower than your Solr
# Version.
solr.luceneMatchVersion=LUCENE_35
solr.pinkPony.timeAllowed=-1
# autoCommit after 10000 docs
solr.autoCommit.MaxDocs=10000
# autoCommit after 2 minutes
solr.autoCommit.MaxTime=120000
solr.contrib.dir=../../contrib

View File

@ -0,0 +1,4 @@
# Contains words which shouldn't be indexed for fulltext fields, e.g., because
# they're too common. For documentation of the format, see
# http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StopFilterFactory
# (Lines starting with a pound character # are ignored.)

View File

@ -0,0 +1,3 @@
# Contains synonyms to use for your index. For the format used, see
# http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
# (Lines starting with a pound character # are ignored.)

View File

@ -0,0 +1,31 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
This file allows you to boost certain search items to the top of search
results. You can find out an item's ID by searching directly on the Solr
server. The item IDs are in general constructed as follows:
Search API:
$document->id = $index_id . '-' . $item_id;
Apache Solr Search Integration:
$document->id = $site_hash . '/' . $entity_type . '/' . $entity->id;
If you want this file to be automatically re-loaded when a Solr commit takes
place (e.g., if you have an automatic script active which updates elevate.xml
according to newly-indexed data), place it into Solr's data/ directory.
Otherwise, place it with the other configuration files into the conf/
directory.
See http://wiki.apache.org/solr/QueryElevationComponent for more information.
-->
<elevate>
<!-- Example for ranking the node #1 first in searches for "example query": -->
<!--
<query text="example query">
<doc id="default_node_index-1" />
<doc id="7v3jsc/node/1" />
</query>
-->
<!-- Multiple <query> elements can be specified, contained in one <elevate>. -->
<!-- <query text="...">...</query> -->
</elevate>

View File

@ -0,0 +1,14 @@
# This file contains character mappings for the default fulltext field type.
# The source characters (on the left) will be replaced by the respective target
# characters before any other processing takes place.
# Lines starting with a pound character # are ignored.
#
# For sensible defaults, use the mapping-ISOLatin1Accent.txt file distributed
# with the example application of your Solr version.
#
# Examples:
# "À" => "A"
# "\u00c4" => "A"
# "\u00c4" => "\u0041"
# "æ" => "ae"
# "\n" => " "

View File

@ -0,0 +1,7 @@
#-----------------------------------------------------------------------
# This file blocks words from being operated on by the stemmer and word delimiter.
&amp;
&lt;
&gt;
&#039;
&quot;

View File

@ -0,0 +1,552 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
This is the Solr schema file. This file should be named "schema.xml" and
should be in the conf directory under the solr home
(i.e. ./solr/conf/schema.xml by default)
or located where the classloader for the Solr webapp can find it.
For more information on how to customize this file, please see
http://wiki.apache.org/solr/SchemaXml
-->
<schema name="drupal-4.2-solr-4.x" version="1.3">
<!-- attribute "name" is the name of this schema and is only used for display purposes.
Applications should change this to reflect the nature of the search collection.
version="1.3" is Solr's version number for the schema syntax and semantics. It should
not normally be changed by applications.
1.0: multiValued attribute did not exist, all fields are multiValued by nature
1.1: multiValued attribute introduced, false by default
1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields.
1.3: removed optional field compress feature
-->
<types>
<!-- field type definitions. The "name" attribute is
just a label to be used by field definitions. The "class"
attribute and any other attributes determine the real
behavior of the fieldType.
Class names starting with "solr" refer to java classes in the
org.apache.solr.analysis package.
-->
<!-- The StrField type is not analyzed, but indexed/stored verbatim.
- StrField and TextField support an optional compressThreshold which
limits compression (if enabled in the derived fields) to values which
exceed a certain size (in characters).
-->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
<!-- Binary data type. The data should be sent/retrieved as Base64 encoded strings -->
<fieldtype name="binary" class="solr.BinaryField"/>
<!-- The optional sortMissingLast and sortMissingFirst attributes are
currently supported on types that are sorted internally as strings.
- If sortMissingLast="true", then a sort on this field will cause documents
without the field to come after documents with the field,
regardless of the requested sort order (asc or desc).
- If sortMissingFirst="true", then a sort on this field will cause documents
without the field to come before documents with the field,
regardless of the requested sort order.
- If sortMissingLast="false" and sortMissingFirst="false" (the default),
then default lucene sorting will be used which places docs without the
field first in an ascending sort and last in a descending sort.
-->
<!-- numeric field types that can be sorted, but are not optimized for range queries -->
<fieldType name="integer" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<!--
Note:
These should only be used for compatibility with existing indexes (created with older Solr versions)
or if "sortMissingFirst" or "sortMissingLast" functionality is needed. Use Trie based fields instead.
Numeric field types that manipulate the value into
a string value that isn't human-readable in its internal form,
but with a lexicographic ordering the same as the numeric ordering,
so that range queries work correctly.
-->
<fieldType name="sint" class="solr.TrieIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.TrieFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.TrieLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.TrieDoubleField" sortMissingLast="true" omitNorms="true"/>
<!--
Numeric field types that index each value at various levels of precision
to accelerate range queries when the number of values between the range
endpoints is large. See the javadoc for NumericRangeQuery for internal
implementation details.
Smaller precisionStep values (specified in bits) will lead to more tokens
indexed per value, slightly larger index size, and faster range queries.
A precisionStep of 0 disables indexing at different precision levels.
-->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<!--
The ExternalFileField type gets values from an external file instead of the
index. This is useful for data such as rankings that might change frequently
and require different update frequencies than the documents they are
associated with.
-->
<fieldType name="pfloat" class="solr.FloatField" omitNorms="true"/>
<fieldType name="file" keyField="id" defVal="1" stored="false" indexed="false" class="solr.ExternalFileField" valType="pfloat"/>
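<!-- Illustrative sketch, assuming a hypothetical field "eff_rank" (it would match
the eff_* dynamic field defined further down): Solr would read its values from
a file named external_eff_rank in the index data directory, one key=value pair
per line, keyed by the "id" field. The IDs and values below are made up:
my_index-1=2.5
my_index-2=0.75
Documents without an entry get defVal (1). Such values are typically used in
function queries rather than searched directly.
-->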
<!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and
is a more restricted form of the canonical representation of dateTime
http://www.w3.org/TR/xmlschema-2/#dateTime
The trailing "Z" designates UTC time and is mandatory.
Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
All other components are mandatory.
Expressions can also be used to denote calculations that should be
performed relative to "NOW" to determine the value, e.g.
NOW/HOUR
... Round to the start of the current hour
NOW-1DAY
... Exactly 1 day prior to now
NOW/DAY+6MONTHS+3DAYS
... 6 months and 3 days in the future from the start of
the current day
Consult the DateField javadocs for more information.
-->
<fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>
<!-- A Trie based date field for faster date range queries and date faceting. -->
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
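<!-- Illustrative range queries using the date math described above. The field
name ds_created is hypothetical (it would match the ds_* dynamic field):
ds_created:[NOW/DAY-7DAYS TO NOW]
... From the start of the day seven days ago until now
ds_created:[2013-01-01T00:00:00Z TO 2013-12-31T23:59:59Z]
... Explicit UTC bounds
-->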
<!-- solr.TextField allows the specification of custom text analyzers
specified as a tokenizer and a list of token filters. Different
analyzers may be specified for indexing and querying.
The optional positionIncrementGap puts space between multiple fields of
this type on the same document, with the purpose of preventing false phrase
matching across fields.
For more info on customizing your analyzer chain, please see
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
-->
<!-- One can also specify an existing Analyzer class that has a
default constructor via the class attribute on the analyzer element
<fieldType name="text_greek" class="solr.TextField">
<analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
</fieldType>
-->
<!-- A text field that only splits on whitespace for exact matching of words -->
<fieldType name="text_ws" class="solr.TextField" omitNorms="true" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- A text field that uses WordDelimiterFilter to enable splitting and matching of
words on case-change, alpha numeric boundaries, and non-alphanumeric chars,
so that a query of "wifi" or "wi fi" could match a document containing "Wi-Fi".
Synonyms and stopwords are customized by external files, and stemming is enabled.
Duplicate tokens at the same position (which may result from Stemmed Synonyms or
WordDelim parts) are removed.
-->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- Case insensitive stop word removal.
add enablePositionIncrements=true in both the index and query
analyzers to leave a 'gap' for more accurate phrase queries.
-->
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="1"
catenateNumbers="1"
catenateAll="0"
splitOnCaseChange="0"
preserveOriginal="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
splitOnCaseChange="0"
preserveOriginal="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
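<!-- Illustrative walk-through of the index analyzer above. This is a sketch:
terms are shown before stemming, and exact output may vary by Solr version.
The input "Wi-Fi" becomes roughly: wi-fi (original), wi, fi, wifi (catenated).
The query analyzer does not catenate, so the query "wifi" yields wifi and the
query "wi fi" yields wi and fi; both find the document via the terms above.
-->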
<!-- An unstemmed text field - good if one does not know the language of the field -->
<fieldType name="text_und" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="1"
catenateNumbers="1"
catenateAll="0"
splitOnCaseChange="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
splitOnCaseChange="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- Edge n-gram type, for example for auto-suggest: matching partial input against
stored queries that returned results.
KeywordTokenizer leaves the input string intact as a single term.
see: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
-->
<fieldType name="edge_n2_kw_text" class="solr.TextField" omitNorms="true" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
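<!-- Illustrative sketch: with minGramSize="2" and maxGramSize="25", the index
analyzer above turns the value "Solr" into the terms so, sol and solr. The
query analyzer only lowercases, so the partial input "So" becomes so and
matches, which gives prefix (type-ahead) matching on the whole value.
-->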
<!-- Setup simple analysis for spell checking -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.LengthFilterFactory" min="4" max="20" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
</analyzer>
</fieldType>
<!-- This is an example of using the KeywordTokenizer along
with various TokenFilterFactories to produce a sortable field
that does not include some properties of the source text.
-->
<fieldType name="sortString" class="solr.TextField" sortMissingLast="true" omitNorms="true">
<analyzer>
<!-- KeywordTokenizer does no actual tokenizing, so the entire
input string is preserved as a single token
-->
<tokenizer class="solr.KeywordTokenizerFactory"/>
<!-- The LowerCase TokenFilter does what you expect, which can be
useful when you want your sorting to be case insensitive.
-->
<filter class="solr.LowerCaseFilterFactory" />
<!-- The TrimFilter removes any leading or trailing whitespace -->
<filter class="solr.TrimFilterFactory" />
<!-- The PatternReplaceFilter gives you the flexibility to use
Java regular expressions to replace any sequence of characters
matching a pattern with an arbitrary replacement string,
which may include back references to portions of the original
string matched by the pattern.
See the Java regular expression documentation for more
information on pattern and replacement string syntax.
http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html
<filter class="solr.PatternReplaceFilterFactory"
pattern="(^\p{Punct}+)" replacement="" replace="all"
/>
-->
</analyzer>
</fieldType>
<!-- A random sort type -->
<fieldType name="rand" class="solr.RandomSortField" indexed="true" />
<!-- since fields of this type are by default not stored or indexed, any data added to
them will be ignored outright
-->
<fieldType name="ignored" stored="false" indexed="false" class="solr.StrField" />
<!-- Begin added types to use features in Solr 3.4+ -->
<fieldType name="point" class="solr.PointType" dimension="2" subFieldType="tdouble"/>
<!-- A specialized field for geospatial search. If indexed, this fieldType must not be multivalued. -->
<fieldType name="location" class="solr.LatLonType" subFieldType="tdouble"/>
<!-- A Geohash is a compact representation of a latitude longitude pair in a single field.
See http://wiki.apache.org/solr/SpatialSearch
-->
<fieldType name="geohash" class="solr.GeoHashField"/>
<!-- End added Solr 3.4+ types -->
</types>
<!-- Following is a dynamic way to include other types, added by other contrib modules -->
<xi:include href="schema_extra_types.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:fallback></xi:fallback>
</xi:include>
<fields>
<!-- Valid attributes for fields:
name: mandatory - the name for the field
type: mandatory - the name of a previously defined type from the <types> section
indexed: true if this field should be indexed (searchable or sortable)
stored: true if this field should be retrievable
compressed: [false] if this field should be stored using gzip compression
(this will only apply if the field type is compressible; among
the standard field types, only TextField and StrField are)
multiValued: true if this field may contain multiple values per document
omitNorms: (expert) set to true to omit the norms associated with
this field (this disables length normalization and index-time
boosting for the field, and saves some memory). Only full-text
fields or fields that need an index-time boost need norms.
-->
<!-- The document id is usually derived from a site-specific key (hash) and the
entity type and ID, like:
Search API:
The format used is $document->id = $index_id . '-' . $item_id
Apache Solr Search Integration:
The format used is $document->id = $site_hash . '/' . $entity_type . '/' . $entity->id;
-->
<field name="id" type="string" indexed="true" stored="true" required="true" />
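<!-- Illustrative example for the Search API format, assuming a hypothetical
index with the machine name "default_node_index" and an item with ID 42:
id = default_node_index-42
index_id = default_node_index
item_id = 42
-->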
<!-- Add Solr Cloud version field as mentioned in
http://wiki.apache.org/solr/SolrCloud#Required_Config
-->
<field name="_version_" type="long" indexed="true" stored="true" multiValued="false"/>
<!-- Search Api specific fields -->
<!-- item_id contains the entity ID, e.g. a node's nid. -->
<field name="item_id" type="string" indexed="true" stored="true" />
<!-- index_id is the machine name of the search index this entry belongs to. -->
<field name="index_id" type="string" indexed="true" stored="true" />
<!-- Since sorting by ID is explicitly allowed, store item_id also in a sortable way. -->
<copyField source="item_id" dest="sort_search_api_id" />
<!-- Apache Solr Search Integration specific fields -->
<!-- entity_id is the numeric object ID, e.g. Node ID, File ID -->
<field name="entity_id" type="long" indexed="true" stored="true" />
<!-- entity_type is 'node', 'file', 'user', or some other Drupal object type -->
<field name="entity_type" type="string" indexed="true" stored="true" />
<!-- bundle is a node type, or as appropriate for other entity types -->
<field name="bundle" type="string" indexed="true" stored="true"/>
<field name="bundle_name" type="string" indexed="true" stored="true"/>
<field name="site" type="string" indexed="true" stored="true"/>
<field name="hash" type="string" indexed="true" stored="true"/>
<field name="url" type="string" indexed="true" stored="true"/>
<!-- label is the default field for a human-readable string for this entity (e.g. the title of a node) -->
<field name="label" type="text" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<!-- The string version of the title is used for sorting -->
<copyField source="label" dest="sort_label"/>
<!-- content is the default field for full text search - dump crap here -->
<field name="content" type="text" indexed="true" stored="true" termVectors="true"/>
<field name="teaser" type="text" indexed="false" stored="true"/>
<field name="path" type="string" indexed="true" stored="true"/>
<field name="path_alias" type="text" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<!-- These are the fields that correspond to a Drupal node. The beauty of having
Lucene store title, body, type, etc., is that we retrieve them with the search
result set and don't need to go to the database with a node_load. -->
<field name="tid" type="long" indexed="true" stored="true" multiValued="true"/>
<field name="taxonomy_names" type="text" indexed="true" stored="false" termVectors="true" multiValued="true" omitNorms="true"/>
<!-- Copy terms to a single field that contains all taxonomy term names -->
<copyField source="tm_vid_*" dest="taxonomy_names"/>
<!-- Here, default is used to create a "timestamp" field indicating
when each document was indexed.-->
<field name="timestamp" type="tdate" indexed="true" stored="true" default="NOW" multiValued="false"/>
<!-- This field is used to build the spellchecker index -->
<field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field differently,
or to add multiple fields to the same field for easier/faster searching. -->
<copyField source="label" dest="spell"/>
<copyField source="content" dest="spell"/>
<copyField source="ts_*" dest="spell"/>
<copyField source="tm_*" dest="spell"/>
<!-- Dynamic field definitions. If a field name is not found, dynamicFields
will be used if the name matches any of the patterns.
RESTRICTION: the glob-like pattern in the name attribute must have
a "*" only at the start or the end.
EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, z_i)
Longer patterns will be matched first. If equal-length patterns
both match, the first appearing in the schema will be used. -->
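<!-- Illustrative reading of the matching rules above (field names are hypothetical):
a field named "sort_title" matches both sort_* and the catch-all "*" pattern
defined at the end of this section; the longer pattern sort_* is used.
A field named "foo_bar" matches only "*" and is therefore ignored.
-->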
<!-- A set of fields to contain text extracted from HTML tag contents which we
can boost at query time. -->
<dynamicField name="tags_*" type="text" indexed="true" stored="false" omitNorms="true"/>
<!-- For 2 and 3 letter prefix dynamic fields, the 1st letter indicates the data type and
the last letter is 's' for single valued, 'm' for multi-valued -->
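<!-- Illustrative reading of the prefix convention (the field names are hypothetical):
is_nid ... long, single-valued (is_*)
im_tids ... long, multi-valued (im_*)
ss_language ... string, single-valued (ss_*)
tm_body ... text, multi-valued (tm_*)
ds_created ... trie date, single-valued (ds_*)
Three-letter prefixes put a qualifier in the middle, e.g. its_* is a
trie long and tus_* is unstemmed text.
-->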
<!-- We use long for integer since 64 bit ints are now common in PHP. -->
<dynamicField name="is_*" type="long" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="im_*" type="long" indexed="true" stored="true" multiValued="true"/>
<!-- List of floats can be saved in a regular float field -->
<dynamicField name="fs_*" type="float" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="fm_*" type="float" indexed="true" stored="true" multiValued="true"/>
<!-- List of doubles can be saved in a regular double field -->
<dynamicField name="ps_*" type="double" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="pm_*" type="double" indexed="true" stored="true" multiValued="true"/>
<!-- List of booleans can be saved in a regular boolean field -->
<dynamicField name="bm_*" type="boolean" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="bs_*" type="boolean" indexed="true" stored="true" multiValued="false"/>
<!-- Regular text (without processing) can be stored in a string field-->
<dynamicField name="ss_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="sm_*" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Normal text fields are for full text - the relevance of a match depends on the length of the text -->
<dynamicField name="ts_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tm_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<!-- Unstemmed text fields for full text - the relevance of a match depends on the length of the text -->
<dynamicField name="tus_*" type="text_und" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tum_*" type="text_und" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<!-- These text fields omit norms - useful for extracted text like taxonomy_names -->
<dynamicField name="tos_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true" omitNorms="true"/>
<dynamicField name="tom_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true" omitNorms="true"/>
<!-- Special-purpose text fields -->
<dynamicField name="tes_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="false" omitTermFreqAndPositions="true" />
<dynamicField name="tem_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="true" omitTermFreqAndPositions="true" />
<dynamicField name="tws_*" type="text_ws" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="twm_*" type="text_ws" indexed="true" stored="true" multiValued="true"/>
<!-- trie dates are preferred, so give them the 2 letter prefix -->
<dynamicField name="ds_*" type="tdate" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="dm_*" type="tdate" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="its_*" type="tlong" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="itm_*" type="tlong" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="fts_*" type="tfloat" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ftm_*" type="tfloat" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="pts_*" type="tdouble" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ptm_*" type="tdouble" indexed="true" stored="true" multiValued="true"/>
<!-- Binary fields can be populated using base64 encoded data. Useful e.g. for embedding
a small image in a search result using the data URI scheme -->
<dynamicField name="xs_*" type="binary" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="xm_*" type="binary" indexed="false" stored="true" multiValued="true"/>
<!-- In rare cases a date rather than tdate is needed for sortMissingLast -->
<dynamicField name="dds_*" type="date" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ddm_*" type="date" indexed="true" stored="true" multiValued="true"/>
<!-- Sortable fields, good for sortMissingLast support.
We use long for integer since 64 bit ints are now common in PHP. -->
<dynamicField name="iss_*" type="slong" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ism_*" type="slong" indexed="true" stored="true" multiValued="true"/>
<!-- In rare cases a sfloat rather than tfloat is needed for sortMissingLast -->
<dynamicField name="fss_*" type="sfloat" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="fsm_*" type="sfloat" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="pss_*" type="sdouble" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="psm_*" type="sdouble" indexed="true" stored="true" multiValued="true"/>
<!-- In case a 32 bit int is really needed, we provide these fields. 'h' is mnemonic for 'half word', i.e. 32 bit on a 64 bit architecture -->
<dynamicField name="hs_*" type="integer" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="hm_*" type="integer" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="hss_*" type="sint" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="hsm_*" type="sint" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="hts_*" type="tint" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="htm_*" type="tint" indexed="true" stored="true" multiValued="true"/>
<!-- Unindexed string fields that can be used to store values that won't be searchable -->
<dynamicField name="zs_*" type="string" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="zm_*" type="string" indexed="false" stored="true" multiValued="true"/>
<!-- Begin added fields to use features in Solr 3.4+
http://wiki.apache.org/solr/SpatialSearch#geodist_-_The_distance_function -->
<dynamicField name="points_*" type="point" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="pointm_*" type="point" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="locs_*" type="location" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="locm_*" type="location" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="geos_*" type="geohash" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="geom_*" type="geohash" indexed="true" stored="true" multiValued="true"/>
<!-- External file fields -->
<dynamicField name="eff_*" type="file"/>
<!-- End added fields for Solr 3.4+ -->
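<!-- Illustrative sketch of a distance filter and sort on one of the location
fields above. The field name locs_position and the coordinates are made up;
geofilt and geodist are described on the wiki page linked above:
sfield=locs_position&pt=48.8566,2.3522&fq={!geofilt d=5}&sort=geodist() asc
Here d is the search radius in kilometers.
-->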
<!-- Sortable version of the dynamic string field -->
<dynamicField name="sort_*" type="sortString" indexed="true" stored="false"/>
<copyField source="ss_*" dest="sort_*"/>
<!-- A random sort field -->
<dynamicField name="random_*" type="rand" indexed="true" stored="true"/>
<!-- This field is used to store access information (e.g. node access grants), as opposed to field data -->
<dynamicField name="access_*" type="integer" indexed="true" stored="false" multiValued="true"/>
<!-- The following causes solr to ignore any fields that don't already match an existing
field name or dynamic field, rather than reporting them as an error.
Alternately, change the type="ignored" to some other type e.g. "text" if you want
unknown fields indexed and/or stored by default -->
<dynamicField name="*" type="ignored" multiValued="true" />
</fields>
<!-- Following is a dynamic way to include other fields, added by other contrib modules -->
<xi:include href="schema_extra_fields.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:fallback></xi:fallback>
</xi:include>
<!-- Field to use to determine and enforce document uniqueness.
Unless this field is marked with required="false", it will be a required field
-->
<uniqueKey>id</uniqueKey>
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>content</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="AND"/>
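<!-- Illustrative effect of the two settings above (a sketch, assuming the
standard Lucene query parser): the query
solr drupal
is parsed like
content:solr AND content:drupal
whereas defaultOperator="OR" would match documents containing either term.
-->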
</schema>

View File

@ -0,0 +1,23 @@
<fields>
<!--
Adding German dynamic field types to our Solr schema.
If you enable this, make sure you have a folder called lang with stopwords_de.txt
and synonyms_de.txt in it.
This also requires enabling the content in schema_extra_types.xml.
-->
<!--
<field name="label_de" type="text_de" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<field name="content_de" type="text_de" indexed="true" stored="true" termVectors="true"/>
<field name="teaser_de" type="text_de" indexed="false" stored="true"/>
<field name="path_alias_de" type="text_de" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<field name="taxonomy_names_de" type="text_de" indexed="true" stored="false" termVectors="true" multiValued="true" omitNorms="true"/>
<field name="spell_de" type="text_de" indexed="true" stored="true" multiValued="true"/>
<copyField source="label_de" dest="spell_de"/>
<copyField source="content_de" dest="spell_de"/>
<dynamicField name="tags_de_*" type="text_de" indexed="true" stored="false" omitNorms="true"/>
<dynamicField name="ts_de_*" type="text_de" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tm_de_*" type="text_de" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<dynamicField name="tos_de_*" type="text_de" indexed="true" stored="true" multiValued="false" termVectors="true" omitNorms="true"/>
<dynamicField name="tom_de_*" type="text_de" indexed="true" stored="true" multiValued="true" termVectors="true" omitNorms="true"/>
-->
</fields>

View File

@ -0,0 +1,30 @@
<types>
<!--
Adding German language support to our Solr schema.
If you enable this, make sure you have a folder called lang with stopwords_de.txt
and synonyms_de.txt in it.
-->
<!--
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" words="lang/stopwords_de.txt" format="snowball" ignoreCase="true" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" splitOnCaseChange="1" splitOnNumerics="1" catenateWords="1" catenateNumbers="1" catenateAll="0" protected="protwords.txt" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.GermanLightStemFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms_de.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" words="lang/stopwords_de.txt" format="snowball" ignoreCase="true" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" splitOnCaseChange="1" splitOnNumerics="1" catenateWords="0" catenateNumbers="0" catenateAll="0" protected="protwords.txt" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.GermanLightStemFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
-->
</types>


View File

@ -0,0 +1,80 @@
<!-- Spell Check
The spell check component can return a list of alternative spelling
suggestions.
http://wiki.apache.org/solr/SpellCheckComponent
-->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<!-- Multiple "Spell Checkers" can be declared and used by this
component
-->
<!-- a spellchecker built from a field of the main index, and
written to disk
-->
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">spell</str>
<str name="spellcheckIndexDir">spellchecker</str>
<str name="buildOnOptimize">true</str>
<!-- uncomment this to require terms to occur in 1% of the documents in order to be included in the dictionary
<float name="thresholdTokenFrequency">.01</float>
-->
</lst>
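<!-- Illustrative request parameters for the "default" spellchecker above. This
is a sketch: it assumes the request handler in use includes this
"spellcheck" component, and the misspelled query word is made up:
q=serach&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=5
Since buildOnOptimize is set, the dictionary is rebuilt on optimize; it can
also be built explicitly by sending spellcheck.build=true once.
-->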
<!--
Adding a German spellchecker index to our Solr index.
This also requires enabling the content in schema_extra_types.xml and schema_extra_fields.xml.
-->
<!--
<lst name="spellchecker">
<str name="name">spellchecker_de</str>
<str name="field">spell_de</str>
<str name="spellcheckIndexDir">./spellchecker_de</str>
<str name="buildOnOptimize">true</str>
</lst>
-->
<!-- a spellchecker that uses a different distance measure -->
<!--
<lst name="spellchecker">
<str name="name">jarowinkler</str>
<str name="field">spell</str>
<str name="distanceMeasure">
org.apache.lucene.search.spell.JaroWinklerDistance
</str>
<str name="spellcheckIndexDir">spellcheckerJaro</str>
</lst>
-->
<!-- a spellchecker that uses an alternate comparator
comparatorClass can be one of:
1. score (default)
2. freq (Frequency first, then score)
3. A fully qualified class name
-->
<!--
<lst name="spellchecker">
<str name="name">freq</str>
<str name="field">lowerfilt</str>
<str name="spellcheckIndexDir">spellcheckerFreq</str>
<str name="comparatorClass">freq</str>
<str name="buildOnCommit">true</str>
</lst>
-->
<!-- A spellchecker that reads the list of words from a file -->
<!--
<lst name="spellchecker">
<str name="classname">solr.FileBasedSpellChecker</str>
<str name="name">file</str>
<str name="sourceLocation">spellings.txt</str>
<str name="characterEncoding">UTF-8</str>
<str name="spellcheckIndexDir">spellcheckerFile</str>
</lst>
-->
</searchComponent>

View File

@ -0,0 +1,20 @@
# Defines Solr properties for this specific core.
solr.replication.master=false
solr.replication.slave=false
solr.replication.pollInterval=00:00:60
solr.replication.masterUrl=http://localhost:8983/solr
solr.replication.confFiles=schema.xml,mapping-ISOLatin1Accent.txt,protwords.txt,stopwords.txt,synonyms.txt,elevate.xml
solr.mlt.timeAllowed=2000
# You should not set your luceneMatchVersion to anything lower than your Solr
# Version.
solr.luceneMatchVersion=LUCENE_40
solr.pinkPony.timeAllowed=-1
# autoCommit after 10000 docs
solr.autoCommit.MaxDocs=10000
# autoCommit after 2 minutes
solr.autoCommit.MaxTime=120000
# autoSoftCommit after 2000 docs
solr.autoSoftCommit.MaxDocs=2000
# autoSoftCommit after 10 seconds
solr.autoSoftCommit.MaxTime=10000
solr.contrib.dir=../../../contrib
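# Illustrative only: Solr substitutes these values wherever a config file
# references them as ${property:default}. For example, a solrconfig.xml could
# contain the following (an assumption; that file is not shown here):
#   <maxDocs>${solr.autoCommit.MaxDocs:10000}</maxDocs>
#   <maxTime>${solr.autoCommit.MaxTime:120000}</maxTime>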

View File

@ -0,0 +1,4 @@
# Contains words which shouldn't be indexed for fulltext fields, e.g., because
# they're too common. For documentation of the format, see
# http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StopFilterFactory
# (Lines starting with a pound character # are ignored.)
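# Illustrative example entries, one word per line. They are commented out here,
# so they have no effect until the leading "# " is removed:
# a
# an
# the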

View File

@ -0,0 +1,3 @@
# Contains synonyms to use for your index. For the format used, see
# http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
# (Lines starting with a pound character # are ignored.)
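# Illustrative example entries. They are commented out here, so they have no
# effect until the leading "# " is removed.
# Equivalent terms, separated by commas:
# GB,gib,gigabyte,gigabytes
# Explicit mapping, where the left-hand terms are replaced by the right-hand one:
# i-pod, i pod => ipod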