zorba-coders team mailing list archive
-
zorba-coders team
-
Mailing list archive
-
Message #02729
[Merge] lp:~zorba-coders/zorba/caching into lp:zorba
Matthias Brantner has proposed merging lp:~zorba-coders/zorba/caching into lp:zorba.
Requested reviews:
Matthias Brantner (matthias-brantner)
Markos Zaharioudakis (markos-za)
For more details, see:
https://code.launchpad.net/~zorba-coders/zorba/caching/+merge/85373
- automatic caching of recursive, non-sequential, and deterministic functions with atomic parameter and return types
- %ann:cache and %ann:no-cache for controlling function result caching
--
https://code.launchpad.net/~zorba-coders/zorba/caching/+merge/85373
Your team Zorba Coders is subscribed to branch lp:zorba.
=== modified file 'ChangeLog'
--- ChangeLog 2011-12-01 16:19:52 +0000
+++ ChangeLog 2011-12-12 18:25:31 +0000
@@ -1,5 +1,9 @@
Zorba - The XQuery Processor
+version 2.2
+ * Caching of results for recursive functions with atomic parameter and return types.
+ * Added %ann:cache and %ann:no-cache to enable or disable caching of results of functions with atomic parameter and return types.
+
version 2.1
New Features:
@@ -74,6 +78,7 @@
* General index cannot be declared as unique if the type of its key is
xs:anyAtomicType or xs:untypedAtomic.
* Added undo for node revalidation
+ * Optimization for count(collection()) expressions
* Fixed bug #867133 (SWIG PHP build failure on Mac OSX)
* Fixed bug #872796 (validate-in-place can interfere with other update primitives)
* Fixed bug #872799 (validate-in-place can set incorrect types)
=== modified file 'doc/zorba/options.dox'
--- doc/zorba/options.dox 2011-09-14 06:15:19 +0000
+++ doc/zorba/options.dox 2011-12-12 18:25:31 +0000
@@ -278,6 +278,27 @@
In order to be able to use the value twice, the <tt>string:materialize</tt> function must be used to materialize the entire contents of the file <tt>myfile.txt</tt> in memory.
Otherwise, the error zerr:ZSTR0055 is raised.
+\paragraph caching_annotation Caching Results of Functions
+Caching of function results might improve the performance if computational expensive functions are invoked multiple times with the same arguments.
+
+Zorba automatically caches results of recursive, deterministic, and non-sequential functions whose parameter and return types are subtypes of xs:anyAtomicType if at least optimization level O1 is used.
+Specifically, if such a function is called twice with the same arguments, the result of the second call will return the same value without re-evaluating the function.
+
+For example, in the following recursive function computing a fibonacci number, each result is automatically cached and, hence, dramatically improves the performance.
+
+\include zorba/udf/udf-fib-rec.xq
+
+Specifically, this optimization reduces the complexity of the function from O(1.6^n) to O(n).
+
+In order to explicitly disable function caching, the user can specify the <tt>%ann:no-cache</tt> annotation.
+
+In addition, the user can use the <tt>%ann:cache</tt> annotation to cache the results of functions other than the ones that are automatically cached.
+However, this will only work if the function is not updating and its parameter and return types are subtypes of xs:anyAtomicType.
+Zorba will raise a warning if caching is explicitly enabled but the function does not meet this criteria (zwarn:ZWST0005).
+
+Please note, that explicitly enforcing caching for sequential or nondeterministic functions might not give the intended result.
+In such cases, Zorba will raise a warning (zwarn:ZWST0006).
+
\paragraph collection_index_annotations Annotations on Collections and Indexes
The \ref xqddf uses annotations to assign properties to collections and indexes.
=== modified file 'include/zorba/pregenerated/diagnostic_list.h'
--- include/zorba/pregenerated/diagnostic_list.h 2011-11-15 08:23:20 +0000
+++ include/zorba/pregenerated/diagnostic_list.h 2011-12-12 18:25:31 +0000
@@ -600,6 +600,8 @@
extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZDDY0034_INDEX_RANGE_VALUE_PROBE_BAD_KEY_TYPES;
+extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZDDY0035_INDEX_GENERAL_INSERT;
+
extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZDDY0031_IC_NOT_DECLARED;
extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZDDY0032_IC_NOT_ACTIVATED;
@@ -752,6 +754,10 @@
extern ZORBA_DLL_PUBLIC ZorbaWarningCode ZWST0004_AMBIGUOUS_SEQUENTIAL_FLWOR;
+extern ZORBA_DLL_PUBLIC ZorbaWarningCode ZWST0005_CACHING_NOT_POSSIBLE;
+
+extern ZORBA_DLL_PUBLIC ZorbaWarningCode ZWST0006_CACHING_MIGHT_NOT_BE_INTENDED;
+
} // namespace zwarn
} // namespace zorba
#endif /* ZORBA_DIAGNOSTIC_LIST_API_H */
=== modified file 'modules/com/zorba-xquery/www/modules/pregenerated/errors.xq'
--- modules/com/zorba-xquery/www/modules/pregenerated/errors.xq 2011-11-15 08:23:20 +0000
+++ modules/com/zorba-xquery/www/modules/pregenerated/errors.xq 2011-12-12 18:25:31 +0000
@@ -501,6 +501,10 @@
(:~
:)
+declare variable $zerr:ZDDY0035 as xs:QName := fn:QName($zerr:NS, "zerr:ZDDY0035");
+
+(:~
+:)
declare variable $zerr:ZDDY0031 as xs:QName := fn:QName($zerr:NS, "zerr:ZDDY0031");
(:~
=== modified file 'modules/com/zorba-xquery/www/modules/pregenerated/warnings.xq'
--- modules/com/zorba-xquery/www/modules/pregenerated/warnings.xq 2011-11-15 08:10:49 +0000
+++ modules/com/zorba-xquery/www/modules/pregenerated/warnings.xq 2011-12-12 18:25:31 +0000
@@ -53,4 +53,23 @@
(:~
:)
-declare variable $zwarn:ZWST0004 as xs:QName := fn:QName($zwarn:NS, "zwarn:ZWST0004");
\ No newline at end of file
+declare variable $zwarn:ZWST0004 as xs:QName := fn:QName($zwarn:NS, "zwarn:ZWST0004");
+
+(:~
+ :
+ : This warning is raised if the user explicitly enables caching
+ : of function results (using the %ann:cache annotation) but the function
+ : is updating or its parameter and return types are not subtypes of
+ : xs:anyAtomicType.
+ :
+:)
+declare variable $zwarn:ZWST0005 as xs:QName := fn:QName($zwarn:NS, "zwarn:ZWST0005");
+
+(:~
+ :
+ : This warning is raised if the user explicitly enables caching
+ : of function results (using the %ann:cache annotation) and the function
+ : is annotated as sequential or nondeterministic.
+ :
+:)
+declare variable $zwarn:ZWST0006 as xs:QName := fn:QName($zwarn:NS, "zwarn:ZWST0006");
\ No newline at end of file
=== modified file 'src/annotations/annotations.cpp'
--- src/annotations/annotations.cpp 2011-11-07 06:32:00 +0000
+++ src/annotations/annotations.cpp 2011-12-12 18:25:31 +0000
@@ -100,6 +100,9 @@
ZANN(streamable, streamable);
+ ZANN(cache, cache);
+ ZANN(no-cache, nocache);
+
//
// Zorba annotations - xqddf
//
@@ -168,6 +171,10 @@
ZANN(zann_nonsequential));
theRuleSet.push_back(
+ ZANN(zann_cache) |
+ ZANN(zann_nocache));
+
+ theRuleSet.push_back(
ZANN(fn_private) |
ZANN(fn_public));
@@ -433,6 +440,5 @@
}
-
} /* namespace zorba */
/* vim:set et sw=2 ts=2: */
=== modified file 'src/annotations/annotations.h'
--- src/annotations/annotations.h 2011-11-01 13:47:10 +0000
+++ src/annotations/annotations.h 2011-12-12 18:25:31 +0000
@@ -55,6 +55,8 @@
zann_nonassignable,
zann_sequential,
zann_nonsequential,
+ zann_cache,
+ zann_nocache,
zann_variadic,
zann_streamable,
zann_unique,
=== modified file 'src/compiler/codegen/plan_visitor.cpp'
--- src/compiler/codegen/plan_visitor.cpp 2011-12-07 20:46:23 +0000
+++ src/compiler/codegen/plan_visitor.cpp 2011-12-12 18:25:31 +0000
@@ -98,6 +98,7 @@
#endif
#include "functions/function.h"
+#include "functions/udf.h"
#include "functions/library.h"
#include "types/typeops.h"
@@ -2320,11 +2321,14 @@
dynamic_cast<EnclosedIterator*>(iter.getp())->setInUpdateExpr();
}
}
-#if 0
else if (func->isUdf())
{
+ // need to computeResultCaching here for iterprint to work
const user_function* udf = static_cast<const user_function*>(func);
+ udf->computeResultCaching(theCCB->theXQueryDiagnostics);
+ }
+#if 0
if (udf->isExiting())
{
TypeManager* tm = v.get_type_manager();
=== modified file 'src/diagnostics/diagnostic_en.xml'
--- src/diagnostics/diagnostic_en.xml 2011-12-07 20:46:23 +0000
+++ src/diagnostics/diagnostic_en.xml 2011-12-12 18:25:31 +0000
@@ -2013,6 +2013,10 @@
<value>"$1": index range-value probe has search keys with incompatible types</value>
</diagnostic>
+ <diagnostic code="ZDDY0035" name="INDEX_GENERAL_INSERT">
+ <value>"$1": index inserting more than one key not allowed for general index</value>
+ </diagnostic>
+
<diagnostic code="ZDDY0031" name="IC_NOT_DECLARED">
<value>"$1": integrity constraint is not declared</value>
</diagnostic>
@@ -2323,6 +2327,37 @@
<value>Sequential FLWOR expr may not have the semantics you expect</value>
</diagnostic>
+ <diagnostic code="ZWST0005" name="CACHING_NOT_POSSIBLE">
+ <comment>
+ This warning is raised if the user explicitly enables caching
+ of function results (using the %ann:cache annotation) but the function
+ is updating or its parameter and return types are not subtypes of
+ xs:anyAtomicType.
+ </comment>
+ <value>"$1": function caching not possible; $2</value>
+ <entry key="RETURN_TYPE">
+ <value>return type ($3) is not subtype of xs:anyAtomicType</value>
+ </entry>
+ <entry key="PARAM_TYPE">
+ <value>type of parameter $3 is $4 which is not a subtype of xs:anyAtomicType</value>
+ </entry>
+ <entry key="UPDATING">
+ <value>function is updating</value>
+ </entry>
+ <entry key="VARIADIC">
+ <value>function is variadic</value>
+ </entry>
+ </diagnostic>
+
+ <diagnostic code="ZWST0006" name="CACHING_MIGHT_NOT_BE_INTENDED">
+ <comment>
+ This warning is raised if the user explicitly enables caching
+ of function results (using the %ann:cache annotation) and the function
+ is annotated as sequential or nondeterministic.
+ </comment>
+ <value>"$1": function caching might not give the intended result because the function is declared as $2</value>
+ </diagnostic>
+
</namespace>
<!--////////// Subvalues /////////////////////////////////////////////////-->
=== modified file 'src/diagnostics/pregenerated/diagnostic_list.cpp'
--- src/diagnostics/pregenerated/diagnostic_list.cpp 2011-11-15 08:23:20 +0000
+++ src/diagnostics/pregenerated/diagnostic_list.cpp 2011-12-12 18:25:31 +0000
@@ -879,6 +879,9 @@
ZorbaErrorCode ZDDY0034_INDEX_RANGE_VALUE_PROBE_BAD_KEY_TYPES( "ZDDY0034" );
+ZorbaErrorCode ZDDY0035_INDEX_GENERAL_INSERT( "ZDDY0035" );
+
+
ZorbaErrorCode ZDDY0031_IC_NOT_DECLARED( "ZDDY0031" );
@@ -1104,6 +1107,12 @@
ZorbaWarningCode ZWST0004_AMBIGUOUS_SEQUENTIAL_FLWOR( "ZWST0004" );
+ZorbaWarningCode ZWST0005_CACHING_NOT_POSSIBLE( "ZWST0005" );
+
+
+ZorbaWarningCode ZWST0006_CACHING_MIGHT_NOT_BE_INTENDED( "ZWST0006" );
+
+
} // namespace zwarn
} // namespace zorba
=== modified file 'src/diagnostics/pregenerated/dict_en.cpp'
--- src/diagnostics/pregenerated/dict_en.cpp 2011-12-01 16:19:52 +0000
+++ src/diagnostics/pregenerated/dict_en.cpp 2011-12-12 18:25:31 +0000
@@ -298,6 +298,7 @@
{ "ZDDY0032", "\"$1\": integrity constraint is not activated" },
{ "ZDDY0033", "\"$1\": integrity constraint not met for collection \"$2\"" },
{ "ZDDY0034", "\"$1\": index range-value probe has search keys with incompatible types" },
+ { "ZDDY0035", "\"$1\": index inserting more than one key not allowed for general index" },
{ "ZDST0001", "\"$1\": collection already declared" },
{ "ZDST0002", "\"$1\": collection already imported into module \"$2\"" },
{ "ZDST0003", "\"$1\": collection declaration not allowed in main module" },
@@ -360,6 +361,8 @@
{ "ZWST0002", "\"$1\": unknown or unsupported annotation" },
{ "ZWST0003", "\"$1\": function declared sequential, but has non-sequential body" },
{ "ZWST0004", "Sequential FLWOR expr may not have the semantics you expect" },
+ { "ZWST0005", "\"$1\": function caching not possible; $2" },
+ { "ZWST0006", "\"$1\": function caching might not give the intended result because the function is declared as $2" },
{ "ZXQD0001", "\"$1\": prefix not declared when calling function \"$2\" from $3" },
{ "ZXQD0002", "\"$1\": $2" },
{ "ZXQD0003", "inconsistent options to the parse-xml-fragment() function: $1" },
@@ -664,6 +667,10 @@
{ "~XUST0002_UDF_2", "\"$2\": function declared updating but body is not updating or vacuous" },
{ "~ZDST0060_unknown_localname", "unknown localname ($3)" },
{ "~ZDST0060_unknown_namespace", "unknown namespace ($3)" },
+ { "~ZWST0005_PARAM_TYPE", "type of parameter $3 is $4 which is not a subtype of xs:anyAtomicType" },
+ { "~ZWST0005_RETURN_TYPE", "return type ($3) is not subtype of xs:anyAtomicType" },
+ { "~ZWST0005_UPDATING", "function is updating" },
+ { "~ZWST0005_VARIADIC", "function is variadic" },
{ "~ZXQD0004_NON_NEGATIVE", "given value must be non-negative ($2)" },
{ "~ZXQD0004_NOT_WITHIN_RANGE", "not within allowed range ($2)" },
{ "~ZXQP0004_TypeOps_is_in_scope_ForFunctionItemTypes", "TypeOps::is_in_scope() for function-item types" },
=== modified file 'src/functions/udf.cpp'
--- src/functions/udf.cpp 2011-10-18 17:37:41 +0000
+++ src/functions/udf.cpp 2011-12-12 18:25:31 +0000
@@ -26,10 +26,15 @@
#include "compiler/rewriter/framework/rewriter.h"
#include "functions/udf.h"
+#include "annotations/annotations.h"
#include "functions/function_impl.h"
+#include "diagnostics/xquery_warning.h"
+
#include "types/typeops.h"
+#include "store/api/index.h" // needed for destruction of the cache
+
namespace zorba
{
@@ -54,7 +59,10 @@
theIsExiting(false),
theIsLeaf(true),
theIsOptimized(false),
- thePlanStateSize(0)
+ thePlanStateSize(0),
+ theCache(0),
+ theCacheResults(false),
+ theCacheComputed(false)
{
setFlag(FunctionConsts::isUDF);
resetFlag(FunctionConsts::isBuiltin);
@@ -318,7 +326,7 @@
/*******************************************************************************
********************************************************************************/
- PlanIter_t user_function::getPlan(CompilerCB* ccb, uint32_t& planStateSize)
+PlanIter_t user_function::getPlan(CompilerCB* ccb, uint32_t& planStateSize)
{
if (thePlan == NULL)
{
@@ -372,9 +380,205 @@
/*******************************************************************************
********************************************************************************/
-CODEGEN_DEF(user_function)
-{
- return new UDFunctionCallIterator(aSctx, aLoc, aArgs, this);
+store::Index* user_function::getCache() const
+{
+ return theCache;
+}
+
+
+/*******************************************************************************
+
+********************************************************************************/
+void user_function::setCache(store::Index* aCache)
+{
+ theCache = aCache;
+}
+
+
+/*******************************************************************************
+********************************************************************************/
+bool user_function::cacheResults() const
+{
+ return theCacheResults;
+}
+
+
+/*******************************************************************************
+ only cache recursive (non-sequential, non-updating, deterministic)
+ functions with singleton atomic input and output
+********************************************************************************/
+void user_function::computeResultCaching(XQueryDiagnostics* diag) const
+{
+ if (theCacheComputed)
+ {
+ return;
+ }
+
+ struct OnExit {
+ private:
+ bool& theResult;
+ bool& theCacheComputed;
+
+ public:
+ OnExit(bool& aResult, bool& aCacheComputed)
+ : theResult(aResult),
+ theCacheComputed(aCacheComputed) {}
+
+ void cache() { theResult = true; }
+
+ ~OnExit()
+ {
+ theCacheComputed = true;
+ }
+ };
+
+ // will be destroyed when the function is exited
+ // set caching to true if cache() is called
+ OnExit lExit(theCacheResults, theCacheComputed);
+
+ // check necessary conditions
+ // %ann:cache or not %ann:no-cache
+ if (theAnnotationList && theAnnotationList->contains(AnnotationInternal::zann_nocache))
+ {
+ return;
+ }
+
+ // was the %ann:cache annotation given explicitly by the user
+ bool lExplicitCacheRequest = theAnnotationList
+ ?theAnnotationList->contains(AnnotationInternal::zann_cache)
+ :false;
+
+ if (isVariadic())
+ {
+ if (lExplicitCacheRequest)
+ {
+ diag->add_warning(
+ NEW_XQUERY_WARNING(
+ zwarn::ZWST0005_CACHING_NOT_POSSIBLE,
+ WARN_PARAMS(
+ getName()->getStringValue(),
+ ZED( ZWST0005_VARIADIC )
+ ),
+ WARN_LOC(theLoc)
+ )
+ );
+ }
+ return;
+ }
+
+ // parameter and return types are subtype of xs:anyAtomicType?
+ const xqtref_t& lRes = theSignature.returnType();
+ TypeManager* tm = lRes->get_manager();
+
+ if (!TypeOps::is_subtype(tm,
+ *lRes,
+ *GENV_TYPESYSTEM.ANY_ATOMIC_TYPE_ONE,
+ theLoc))
+ {
+ if (lExplicitCacheRequest)
+ {
+ diag->add_warning(
+ NEW_XQUERY_WARNING(
+ zwarn::ZWST0005_CACHING_NOT_POSSIBLE,
+ WARN_PARAMS(
+ getName()->getStringValue(),
+ ZED( ZWST0005_RETURN_TYPE ),
+ lRes->toString()
+ ),
+ WARN_LOC(theLoc)
+ )
+ );
+ }
+ return;
+ }
+
+ size_t lArity = theSignature.paramCount();
+ for (size_t i = 0; i < lArity; ++i)
+ {
+ const xqtref_t& lArg = theSignature[i];
+ if (!TypeOps::is_subtype(tm,
+ *lArg,
+ *GENV_TYPESYSTEM.ANY_ATOMIC_TYPE_ONE,
+ theLoc))
+ {
+ if (lExplicitCacheRequest)
+ {
+ diag->add_warning(
+ NEW_XQUERY_WARNING(
+ zwarn::ZWST0005_CACHING_NOT_POSSIBLE,
+ WARN_PARAMS(
+ getName()->getStringValue(),
+ ZED( ZWST0005_PARAM_TYPE ),
+ i+1,
+ lArg->toString()
+ ),
+ WARN_LOC(theLoc)
+ )
+ );
+ }
+ return;
+ }
+ }
+
+ // function updating?
+ if (isUpdating())
+ {
+ if (lExplicitCacheRequest)
+ {
+ diag->add_warning(
+ NEW_XQUERY_WARNING(
+ zwarn::ZWST0005_CACHING_NOT_POSSIBLE,
+ WARN_PARAMS(
+ getName()->getStringValue(),
+ ZED( ZWST0005_UPDATING )
+ ),
+ WARN_LOC(theLoc)
+ )
+ );
+ }
+ return;
+ }
+
+ if (isSequential() || !isDeterministic())
+ {
+ if (lExplicitCacheRequest)
+ {
+ diag->add_warning(
+ NEW_XQUERY_WARNING(
+ zwarn::ZWST0006_CACHING_MIGHT_NOT_BE_INTENDED,
+ WARN_PARAMS(
+ getName()->getStringValue(),
+ (isSequential()?"sequential":"non-deterministic")
+ ),
+ WARN_LOC(theLoc)
+ )
+ );
+ lExit.cache();
+ }
+ return;
+ }
+
+
+ // optimization is prerequisite before invoking isRecursive
+ if (!lExplicitCacheRequest && isOptimized() && !isRecursive())
+ {
+ return;
+ }
+
+ lExit.cache();
+}
+
+/*******************************************************************************
+
+********************************************************************************/
+PlanIter_t user_function::codegen(
+ CompilerCB* cb,
+ static_context* sctx,
+ const QueryLoc& loc,
+ std::vector<PlanIter_t>& argv,
+ AnnotationHolder& ann) const
+{
+ return new UDFunctionCallIterator(sctx, loc, argv, this);
}
=== modified file 'src/functions/udf.h'
--- src/functions/udf.h 2011-10-18 17:37:41 +0000
+++ src/functions/udf.h 2011-12-12 18:25:31 +0000
@@ -25,6 +25,12 @@
namespace zorba
{
+ namespace store
+ {
+ class Index;
+ typedef rchandle<Index> Index_t;
+ }
+
/*******************************************************************************
A udf with params $x1, $x2, ..., $xn and a body_expr is translated into a
@@ -102,6 +108,10 @@
uint32_t thePlanStateSize;
std::vector<ArgVarRefs> theArgVarsRefs;
+ mutable store::Index_t theCache;
+ mutable bool theCacheResults;
+ mutable bool theCacheComputed;
+
public:
SERIALIZABLE_CLASS(user_function)
user_function(::zorba::serialization::Archiver& ar);
@@ -160,12 +170,21 @@
const std::vector<ArgVarRefs>& getArgVarsRefs() const;
+ store::Index* getCache() const;
+
+ void setCache(store::Index* aCache);
+
+ bool cacheResults() const;
+
PlanIter_t codegen(
CompilerCB* cb,
static_context* sctx,
const QueryLoc& loc,
std::vector<PlanIter_t>& argv,
AnnotationHolder& ann) const;
+
+ void
+ computeResultCaching(XQueryDiagnostics*) const;
};
=== modified file 'src/runtime/core/fncall_iterator.cpp'
--- src/runtime/core/fncall_iterator.cpp 2011-10-30 00:18:34 +0000
+++ src/runtime/core/fncall_iterator.cpp 2011-12-12 18:25:31 +0000
@@ -48,6 +48,11 @@
#include "util/string_util.h"
+#include "store/api/index.h"
+#include "store/api/store.h"
+#include "store/api/iterator_factory.h"
+#include "store/api/temp_seq.h"
+
#ifdef ZORBA_WITH_DEBUGGER
#include "debugger/debugger_commons.h"
@@ -93,7 +98,8 @@
thePlanState(NULL),
thePlanStateSize(0),
theLocalDCtx(NULL),
- thePlanOpen(false)
+ thePlanOpen(false),
+ theCache(0)
{
}
@@ -153,6 +159,7 @@
{
thePlan->reset(*thePlanState);
}
+ theCacheHits.clear();
}
@@ -198,6 +205,103 @@
/*******************************************************************************
********************************************************************************/
+bool UDFunctionCallIterator::isCached() const
+{
+ return theUDF->cacheResults();
+}
+
+/*******************************************************************************
+
+********************************************************************************/
+void UDFunctionCallIterator::createCache(
+ PlanState& planState,
+ UDFunctionCallIteratorState* state)
+{
+ store::Index_t lIndex = theUDF->getCache();
+ if (!lIndex && theUDF->cacheResults())
+ {
+ const signature& sig = theUDF->getSignature();
+
+ csize numArgs = theChildren.size();
+
+ store::IndexSpecification lSpec;
+ lSpec.theNumKeyColumns = numArgs;
+ lSpec.theKeyTypes.resize(numArgs);
+ lSpec.theCollations.resize(numArgs);
+ lSpec.theIsTemp = true;
+ lSpec.theIsUnique = true;
+ for (csize i = 0; i < numArgs; ++i)
+ {
+ lSpec.theKeyTypes[i] = sig[i]->getBaseBuiltinType()->get_qname().getp();
+ }
+ lIndex = GENV_STORE.createIndex(theUDF->getName(), lSpec, 0);
+ theUDF->setCache(lIndex.getp()); // cache the cache in the function itself
+ state->theCacheHits.reserve(theChildren.size());
+ }
+ state->theCache = lIndex.getp();
+}
+
+
+/*******************************************************************************
+
+********************************************************************************/
+bool UDFunctionCallIterator::probeCache(
+ PlanState& planState,
+ UDFunctionCallIteratorState* state,
+ store::Item_t& result,
+ std::vector<store::Item_t>& aKey) const
+{
+ if (!state->theCache)
+ return false;
+
+ store::IndexCondition_t lCond =
+ state->theCache->createCondition(store::IndexCondition::POINT_VALUE);
+
+ std::vector<store::Iterator_t>::iterator lIter
+ = state->theArgWrappers.begin();
+ for (; lIter != state->theArgWrappers.end(); ++lIter)
+ {
+ store::Iterator_t& argWrapper = (*lIter);
+ store::Item_t lArg;
+ if (argWrapper) // might be 0 if argument is not used
+ {
+ argWrapper->next(lArg); // guaranteed to have exactly one result
+ }
+ aKey.push_back(lArg);
+ lCond->pushItem(lArg);
+ }
+ store::IndexProbeIterator_t lCacheHit =
+ GENV_STORE.getIteratorFactory()->createIndexProbeIterator(state->theCache);
+ lCacheHit->init(lCond);
+ lCacheHit->open();
+ return lCacheHit->next(result);
+}
+
+
+/*******************************************************************************
+
+********************************************************************************/
+void UDFunctionCallIterator::insertCacheEntry(
+ UDFunctionCallIteratorState* state,
+ std::vector<store::Item_t>& aKey,
+ store::Item_t& aValue) const
+{
+ if (state->theCache)
+ {
+ std::auto_ptr<store::IndexKey> k(new store::IndexKey());
+ store::IndexKey* k2 = k.get();
+ k->theItems = aKey;
+ store::Item_t lTmp = aValue; // insert will eventually transfer the Item_t
+ if (!state->theCache->insert(k2, lTmp))
+ {
+ k.release();
+ }
+ }
+}
+
+/*******************************************************************************
+
+********************************************************************************/
void UDFunctionCallIterator::openImpl(PlanState& planState, uint32_t& offset)
{
UDFunctionCallIteratorState* state;
@@ -229,6 +333,11 @@
// the plan state (but not the state block) and dynamic context.
state->open(planState, theUDF);
+ // if the results of the function should be cached (prereq: atomic in and out)
+ // this functions stores an index in the dynamic context that contains
+ // the cached results. The name of the index is the name of the function.
+ createCache(planState, state);
+
// Create a wrapper over each subplan that computes an argument expr, if the
// associated param is actually used anywhere in the function body.
csize numArgs = theChildren.size();
@@ -301,6 +410,9 @@
{
try
{
+ std::vector<store::Item_t> lKey;
+ bool lCacheHit;
+
UDFunctionCallIteratorState* state;
DEFAULT_STACK_INIT(UDFunctionCallIteratorState, state, planState);
@@ -314,19 +426,29 @@
state->thePlanOpen = true;
}
- // Bind the args.
+ // check if there is a cache and the result is already in the cache
+ lCacheHit = probeCache(planState, state, result, lKey);
+
+ // if not in the cache, we bind the arguments to the function
+ if (!lCacheHit)
{
const std::vector<ArgVarRefs>& argsRefs = theUDF->getArgVarsRefs();
- std::vector<ArgVarRefs>::const_iterator argsRefsIte = argsRefs.begin();
- std::vector<ArgVarRefs>::const_iterator argsRefsEnd = argsRefs.end();
-
- std::vector<store::Iterator_t>::iterator argWrapsIte =
- state->theArgWrappers.begin();
-
- for (; argsRefsIte != argsRefsEnd; ++argsRefsIte, ++argWrapsIte)
+ const std::vector<store::Iterator_t>& argWraps = state->theArgWrappers;
+ for (size_t i = 0; i < argsRefs.size(); ++i)
{
- store::Iterator_t& argWrapper = (*argWrapsIte);
- const ArgVarRefs& argVarRefs = (*argsRefsIte);
+ const ArgVarRefs& argVarRefs = argsRefs[i];
+ store::Iterator_t argWrapper;
+ if (state->theCache)
+ {
+ std::vector<store::Item_t> lParam(1, lKey[i]);
+ state->theCacheHits.push_back(GENV_STORE.createTempSeq(lParam));
+ argWrapper = state->theCacheHits.back()->getIterator();
+ argWrapper->open();
+ }
+ else
+ {
+ argWrapper = argWraps[i];
+ }
ArgVarRefs::const_iterator argVarRefsIte = argVarRefs.begin();
ArgVarRefs::const_iterator argVarRefsEnd = argVarRefs.end();
@@ -343,25 +465,34 @@
}
}
-#ifdef ZORBA_WITH_DEBUGGER
- DEBUGGER_PUSH_FRAME;
-#endif
-
- while (consumeNext(result, state->thePlan, *state->thePlanState))
- {
-#ifdef ZORBA_WITH_DEBUGGER
- DEBUGGER_POP_FRAME;
-#endif
- STACK_PUSH(true, state);
-
-#ifdef ZORBA_WITH_DEBUGGER
- DEBUGGER_PUSH_FRAME;
-#endif
- }
-
-#ifdef ZORBA_WITH_DEBUGGER
- DEBUGGER_POP_FRAME;
-#endif
+ if (lCacheHit)
+ {
+ STACK_PUSH(true, state);
+ }
+ else
+ {
+#ifdef ZORBA_WITH_DEBUGGER
+ DEBUGGER_PUSH_FRAME;
+#endif
+ while (consumeNext(result, state->thePlan, *state->thePlanState))
+ {
+#ifdef ZORBA_WITH_DEBUGGER
+ DEBUGGER_POP_FRAME;
+#endif
+
+ insertCacheEntry(state, lKey, result);
+ STACK_PUSH(true, state);
+
+#ifdef ZORBA_WITH_DEBUGGER
+ DEBUGGER_PUSH_FRAME;
+#endif
+ }
+
+#ifdef ZORBA_WITH_DEBUGGER
+ DEBUGGER_POP_FRAME;
+#endif
+
+ }
STACK_END(state);
=== modified file 'src/runtime/core/fncall_iterator.h'
--- src/runtime/core/fncall_iterator.h 2011-10-30 00:18:34 +0000
+++ src/runtime/core/fncall_iterator.h 2011-12-12 18:25:31 +0000
@@ -69,6 +69,19 @@
the body. So, it is never the case that the arg expr will have more than one
consumers, and as a result we can bind all those V references to the same arg
wrapper.
+
+ theCache:
+ ---------
+ Is an Index which is set in the state if caching for the invoked function
+ should be done. The cache is owned by the UDF itself and shared across
+ all function invocations.
+
+ theCacheHits:
+ -------------
+ If caching is used, this vector contains the results of all arguments
+ of the function evaluation. It's used to bind the variables if the
+ cache didn't give a result in order to avoid duplicate evaluation of
+ the arguments.
********************************************************************************/
class UDFunctionCallIteratorState : public PlanIteratorState
{
@@ -79,6 +92,8 @@
dynamic_context * theLocalDCtx;
bool thePlanOpen;
std::vector<store::Iterator_t> theArgWrappers;
+ store::Index * theCache;
+ std::vector<store::TempSeq_t> theCacheHits;
UDFunctionCallIteratorState();
@@ -130,6 +145,8 @@
void setDynamic() { theIsDynamic = true; }
+ bool isCached() const;
+
void accept(PlanIterVisitor& v) const;
void openImpl(PlanState& planState, uint32_t& offset);
@@ -139,6 +156,22 @@
void closeImpl(PlanState& planState);
bool nextImpl(store::Item_t& result, PlanState& planState) const;
+
+protected:
+ void createCache(
+ PlanState& planState,
+ UDFunctionCallIteratorState* state);
+
+ bool probeCache(
+ PlanState& planState,
+ UDFunctionCallIteratorState* state,
+ store::Item_t& result,
+ std::vector<store::Item_t>& aKey) const;
+
+ void insertCacheEntry(
+ UDFunctionCallIteratorState* state,
+ std::vector<store::Item_t>& aKey,
+ store::Item_t& aValue) const;
};
=== modified file 'src/runtime/visitors/printer_visitor_impl.cpp'
--- src/runtime/visitors/printer_visitor_impl.cpp 2011-12-01 11:02:25 +0000
+++ src/runtime/visitors/printer_visitor_impl.cpp 2011-12-12 18:25:31 +0000
@@ -1209,7 +1209,23 @@
#undef TYPED_VAL_CMP
- PRINTER_VISITOR_DEFINITION (UDFunctionCallIterator)
+ void PrinterVisitor::beginVisit ( const UDFunctionCallIterator& a )
+ {
+ thePrinter.startBeginVisit("UDFunctionCallIterator", ++theId);
+ printCommons( &a, theId );
+ if (a.isCached())
+ {
+ thePrinter.addAttribute("cached", "true");
+ }
+ thePrinter.endBeginVisit( theId);
+ }
+
+ void PrinterVisitor::endVisit ( const UDFunctionCallIterator& )
+ {
+ thePrinter.startEndVisit();
+ thePrinter.endEndVisit();
+ }
+
PRINTER_VISITOR_DEFINITION (ExtFunctionCallIterator)
PRINTER_VISITOR_DEFINITION (FnBooleanIterator)
PRINTER_VISITOR_DEFINITION (OrIterator)
=== modified file 'src/store/api/index.h'
--- src/store/api/index.h 2011-10-11 12:11:48 +0000
+++ src/store/api/index.h 2011-12-12 18:25:31 +0000
@@ -415,6 +415,18 @@
* Returns all keys stored in this index
*/
virtual KeyIterator_t keys() const = 0;
+
+ /**
+ * Insert the given item in the value set of the given key. If the key is not
+ * in the index already, then the key itself is inserted as well. Return true
+ * if the key was already in the index, false otherwise
+ * The index wil take the ownership of the key if it was not already in the
+ * index.
+ *
+ * @error ZDDY0035 if a key with more than one item is inserted into
+ * a general index
+ */
+ virtual bool insert(store::IndexKey*& key, store::Item_t& item) = 0;
};
=== modified file 'src/store/naive/simple_index_general.cpp'
--- src/store/naive/simple_index_general.cpp 2011-11-15 18:48:54 +0000
+++ src/store/naive/simple_index_general.cpp 2011-12-12 18:25:31 +0000
@@ -236,6 +236,20 @@
/******************************************************************************
*******************************************************************************/
+bool GeneralIndex::insert(store::IndexKey*& key, store::Item_t& value)
+{
+ if (key->size() != 1)
+ {
+ RAISE_ERROR_NO_LOC(zerr::ZDDY0035_INDEX_GENERAL_INSERT,
+ ERROR_PARAMS(getName()->getStringValue()));
+ }
+ return insert((*key)[0], value);
+}
+
+
+/******************************************************************************
+
+*******************************************************************************/
bool GeneralIndex::insert(store::Item_t& key, store::Item_t& node)
{
bool lossy = false;
=== modified file 'src/store/naive/simple_index_general.h'
--- src/store/naive/simple_index_general.h 2011-11-15 18:48:54 +0000
+++ src/store/naive/simple_index_general.h 2011-12-12 18:25:31 +0000
@@ -174,6 +174,8 @@
bool insert(store::Item_t& key, store::Item_t& node);
+ bool insert(store::IndexKey*& key, store::Item_t& value);
+
virtual bool remove(const store::Item_t& key, store::Item_t& item, bool all) = 0;
};
=== modified file 'src/store/naive/simple_store.cpp'
--- src/store/naive/simple_store.cpp 2011-11-02 17:19:09 +0000
+++ src/store/naive/simple_store.cpp 2011-12-12 18:25:31 +0000
@@ -555,6 +555,9 @@
store::Iterator* aSourceIter,
ulong aNumColumns)
{
+ if (!aSourceIter)
+ return;
+
store::Item_t domainItem;
store::IndexKey* key = NULL;
=== added file 'test/rbkt/ExpCompilerResults/IterPlan/zorba/udf/udf-fib-rec.iter'
--- test/rbkt/ExpCompilerResults/IterPlan/zorba/udf/udf-fib-rec.iter 1970-01-01 00:00:00 +0000
+++ test/rbkt/ExpCompilerResults/IterPlan/zorba/udf/udf-fib-rec.iter 2011-12-12 18:25:31 +0000
@@ -0,0 +1,42 @@
+Iterator tree for main query:
+<UDFunctionCallIterator cached="true">
+ <SingletonIterator value="xs:integer(100)"/>
+</UDFunctionCallIterator>
+
+Iterator tree for local:fib:
+<flwor::FLWORIterator>
+ <ForVariable name="n">
+ <LetVarIterator varname="n"/>
+ </ForVariable>
+ <ReturnClause>
+ <IfThenElseIterator>
+ <TypedValueCompareIterator_INTEGER>
+ <ForVarIterator varname="n"/>
+ <SingletonIterator value="xs:integer(0)"/>
+ </TypedValueCompareIterator_INTEGER>
+ <SingletonIterator value="xs:integer(0)"/>
+ <IfThenElseIterator>
+ <TypedValueCompareIterator_INTEGER>
+ <ForVarIterator varname="n"/>
+ <SingletonIterator value="xs:integer(1)"/>
+ </TypedValueCompareIterator_INTEGER>
+ <SingletonIterator value="xs:integer(1)"/>
+ <SpecificNumArithIterator_AddOperation_INTEGER>
+ <UDFunctionCallIterator cached="true">
+ <SpecificNumArithIterator_SubtractOperation_INTEGER>
+ <ForVarIterator varname="n"/>
+ <SingletonIterator value="xs:integer(1)"/>
+ </SpecificNumArithIterator_SubtractOperation_INTEGER>
+ </UDFunctionCallIterator>
+ <UDFunctionCallIterator cached="true">
+ <SpecificNumArithIterator_SubtractOperation_INTEGER>
+ <ForVarIterator varname="n"/>
+ <SingletonIterator value="xs:integer(2)"/>
+ </SpecificNumArithIterator_SubtractOperation_INTEGER>
+ </UDFunctionCallIterator>
+ </SpecificNumArithIterator_AddOperation_INTEGER>
+ </IfThenElseIterator>
+ </IfThenElseIterator>
+ </ReturnClause>
+</flwor::FLWORIterator>
+
=== added file 'test/rbkt/ExpQueryResults/zorba/udf/udf-fib-rec.xml.res'
--- test/rbkt/ExpQueryResults/zorba/udf/udf-fib-rec.xml.res 1970-01-01 00:00:00 +0000
+++ test/rbkt/ExpQueryResults/zorba/udf/udf-fib-rec.xml.res 2011-12-12 18:25:31 +0000
@@ -0,0 +1,1 @@
+3736710778780434371
=== added file 'test/rbkt/Queries/zorba/udf/udf-fib-rec.xq'
--- test/rbkt/Queries/zorba/udf/udf-fib-rec.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/udf/udf-fib-rec.xq 2011-12-12 18:25:31 +0000
@@ -0,0 +1,8 @@
+declare function local:fib($n as xs:integer) as xs:integer
+{
+ if ($n eq 0) then 0
+ else if ($n eq 1) then 1
+ else local:fib($n - 1) + local:fib($n - 2)
+};
+
+local:fib(100)
Follow ups
-
[Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Zorba Build Bot, 2011-12-12
-
Re: [Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Zorba Build Bot, 2011-12-12
-
[Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Markos Zaharioudakis, 2011-12-12
-
Re: [Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Markos Zaharioudakis, 2011-12-12
-
[Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Zorba Build Bot, 2011-12-12
-
Re: [Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Zorba Build Bot, 2011-12-12
-
[Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Zorba Build Bot, 2011-12-12
-
[Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Zorba Build Bot, 2011-12-12
-
Re: [Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Matthias Brantner, 2011-12-12
-
[Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Matthias Brantner, 2011-12-12
-
[Merge] lp:~zorba-coders/zorba/caching into lp:zorba
From: Matthias Brantner, 2011-12-12