|
This is a pre-print extract from the forthcoming O'Reilly book Lisp Outside the Box. Contents are subject to change as the book's production progresses. Feedback is most welcome, either in private or in public by responding to the blog entry which announced this chapter.

This chapter builds on the last one and continues our exploration of AllegroCache. We’ve already covered in some detail the various ways you can get information into the database and modify it once it’s there; now we really ought to look at how to get your data out. After that we’ll move on to a brief trot through some more advanced topics including database administration.

We’ve already met the macro
(let ((sentences nil))
  (doclass (sentence 'sentence)
    (push (sentence-text sentence) sentences))
  sentences)
=>
("We have to look at how to get your data out."
"Then we'll take a brief look at some more advanced topics.")
Here’s another way of doing it.

CL-USER(31):

Tip: When you don’t need the cursor any more, you should call

Exercise: Working from the example above, write a function which uses a cursor to collect all the instances of a class. How will you guarantee that

At first sight, all we’ve done is add unnecessary baggage: it seems unlikely that your cursor function will look as elegant as using
CL-USER(41):
We’ll come back to this and its uses later. The other gain is more immediately noticeable: we can pass the cursor around as a Lisp value.
(defmacro with-class-cursor ((fun-var class) &body body)
  (let ((cursor-var (gensym "CLASS-CURSOR-")))
    `(let ((,cursor-var (create-class-cursor ,class)))
       (unwind-protect
           (catch ',cursor-var
             (flet ((,fun-var ()
                      (or (next-class-cursor ,cursor-var)
                          (throw ',cursor-var nil))))
               ,@body))
         (free-class-cursor ,cursor-var)))))

(with-class-cursor (next-sentence 'sentence)
  (loop (print (sentence-text (next-sentence)))))
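The control flow of the macro can be tried out without a database. The following is a minimal sketch in portable Common Lisp, not AllegroCache code: the three cursor functions are hypothetical stand-ins backed by a plain list, defined in a scratch package (cursor-sketch, also an invented name) so that they cannot clash with the real ones. The macro itself is copied unchanged from above. The sketch shows that an otherwise endless loop terminates when the items run out, and that the cursor is freed on the way out.

```lisp
;; Work in a scratch package so that these stand-ins cannot clash with
;; the real AllegroCache functions of the same names.
(defpackage :cursor-sketch (:use :common-lisp))
(in-package :cursor-sketch)

(defvar *freed-count* 0
  "Number of times the stand-in FREE-CLASS-CURSOR has been called.")

;; Hypothetical stand-ins: a "cursor" here is just a cons whose car
;; holds the list of items not yet visited.
(defun create-class-cursor (items) (list items))
(defun next-class-cursor (cursor) (pop (car cursor)))
(defun free-class-cursor (cursor)
  (declare (ignore cursor))
  (incf *freed-count*))

;; The macro from the text, unchanged.
(defmacro with-class-cursor ((fun-var class) &body body)
  (let ((cursor-var (gensym "CLASS-CURSOR-")))
    `(let ((,cursor-var (create-class-cursor ,class)))
       (unwind-protect
           (catch ',cursor-var
             (flet ((,fun-var ()
                      (or (next-class-cursor ,cursor-var)
                          (throw ',cursor-var nil))))
               ,@body))
         (free-class-cursor ,cursor-var)))))

(defun collect-items (items)
  "Return a fresh list of ITEMS, visiting them through the cursor macro."
  (let ((seen '()))
    (with-class-cursor (next-item items)
      ;; NEXT-ITEM throws out of this LOOP once the items are used up,
      ;; so the PUSH is never reached for the end-of-data NIL.
      (loop (push (next-item) seen)))
    (nreverse seen)))

(collect-items '("first" "second" "third"))
```

Because the stand-in cursor signals exhaustion with NIL, just as the macro expects, a NIL item in the list would end the walk early; the same is true of any value the real cursor might return as false.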
Exercise: The symbol stored in

We can now go one step beyond

(make-instance 'menu-item
               :text "View next sentence"
               :callback (lambda ()
                           (view-text (sentence-text (next-sentence)))))
Caution: If there are no instances of your class in the cache,

Exercise: Modify

An index is a simple way of filtering and ordering the results of a search. We have to flag the slot(s) which we want to be indexed; as the following example shows, we needn’t get around to doing this until after creating the objects concerned.

;; Modify previous class definition, specifying :INDEX :ANY-UNIQUE for the
;; TEXT slot.
(defclass sentence ()
  ((text :initarg :text :accessor sentence-text :index :any-unique)
   (next :initform nil :accessor sentence-next))
  (:metaclass persistent-class))

;; Write the new index to the database. This will update existing instances
;; and we can keep on working with them.
(commit)

Note that when a slot is indexed as

Let’s try out some searches. Give the function
(retrieve-from-index 'sentence 'text "Hello, World.")
=>
NIL
(retrieve-from-index 'sentence
'text "We have to look at how to get your data out.")
=>
#<SENTENCE oid: 1011, ver 6, trans: 19, not modified @ #x214452fa>
We can put this index to use, building a set of objects which
links Lisp symbols to the sentences which refer to them and
ensuring that the sentence references are unique (see the call to
;; Warning! I'm only showing a small part of my data here.
;; The full list is available via http://lisp-book.org/
(defparameter *chapter-13-links*
'(("doclass" .
"It's the macro doclass which has the following syntax.")
("doclass" .
"Doclass returns nil; you can use (return ...) to terminate the loop
early and return more interesting values.")
("return" .
"Doclass returns nil; you can use (return ...) to terminate the loop
early and return more interesting values.")
...
))
;; This function uses the index on TEXT for the class SENTENCE.
(defun make-links (links)
  (loop for (term . sentence-text) in links do
    (let ((sentence (or ;; Is this sentence already in the cache?
                        (retrieve-from-index 'sentence
                                             'text sentence-text)
                        ;; If not, make a new one.
                        (make-instance 'sentence
                                       :text sentence-text))))
      ;; Make a persistent object linking the term ("doclass", etc) to
      ;; the sentence.
      (make-instance 'link
                     :term term
                     :sentence sentence)))
  ;; End of the loop - commit the changes.
  (commit))

;; Persistent class indexed on slot named TERM.
(defclass link ()
  ((term :reader link-term :initarg :term :index :any)
   (sentence :reader link-sentence :initarg :sentence))
  (:metaclass persistent-class))

(make-links *chapter-13-links*)
Exercise: In

We use the index which we defined in the class

Recall from our raw data (

(retrieve-from-index 'link 'term "doclass" :all t)
=>
(#<LINK oid: 2367, ver 5, trans: 101, not modified @ #x21281922>
 #<LINK oid: 2365, ver 5, trans: 101, not modified @ #x2128190a>)

(mapcar 'link-sentence *)
=>
(#<SENTENCE oid: 2366, ver 6, trans: 101, not modified @ #x21281742>
 #<SENTENCE oid: 2364, ver 6, trans: 101, not modified @ #x2128172a>)

(mapcar 'link-sentence
        (retrieve-from-index 'link 'term "return" :all t))
=>
(#<SENTENCE oid: 2366, ver 6, trans: 101, not modified @ #x21281742>)

Exercise: Persistent classes can have any number of indexed slots. Try out a class with two indexes.

Associated with every object in the cache there is a number
which serves as a unique identifier. Output generated by the
default

CL-USER(51):

Most of the search functions described in this chapter take a

Exercise: The Allegro CL function

Exercise: In the example call to

Object identifiers are a valuable debugging resource. Bear this in mind if you define

Exercise: Suppose you know an object’s identifier but not its class. Use a class from which all objects are bound to inherit and the function

The function
(mapcar 'link-term (retrieve-from-index-range 'link 'term "c" "d"))
=>
("cl:open" "close-database" "close-database" "commit" "commit" "commit"
...)
For both the filtering and the sorting operations, numbers
(integers and floats in increasing numerical order) come before
strings. The target slot may take “other” values in
which case it’ll be sorted ahead of both numbers and strings.
With one exception such other values may not be specified for the

(retrieve-from-index-range 'link 'term nil nil) returns a list of every

Alternatively you can use an index cursor and combine the filtering and sorting of indexes with the flexibility which we met before with class cursors. Create one with

Exercise: Without looking at the AllegroCache Reference Manual, use a cursor to implement

In this context an expression allows you to specify either a value or a range of values for a given slot, or to make logical combinations of other expressions using

Purely to illustrate these features, we’ll add another
slot to the class

;; TERM keeps its :ANY index: several links share the same term.
(defclass link ()
  ((term :reader link-term :initarg :term :index :any)
   (sentence :reader link-sentence :initarg :sentence)
   (text :accessor link-text))
  (:metaclass persistent-class))

(doclass (x 'link)
  (setf (link-text x) (sentence-text (link-sentence x))))

(commit)

To find all
(let* ((expression '(and (= term "rollback")
                         (:range text "A" "B")))
       (cursor (create-expression-cursor 'link expression))
       (links nil))
  (loop (let ((link (or (next-index-cursor cursor)
                        (return))))
          (push link links)))
  (free-index-cursor cursor)
  (mapcar 'link-text links))
=>
("Any uncommitted changes will be dropped and the objects concerned rewound
to their state at the last rollback or commit."
"Alternatively, you can call rollback and all your changes (since the last
commit or rollback) will be abandoned.")
As this example shows, expression cursors work on slots which haven’t been indexed. However, running on indexed slots is faster and you should order your expressions so that indexed values are tested first. (Why?)

In the last chapter we glossed over AllegroCache’s answer to hash tables, and how it stores non-persistent objects. Let’s come back to these now.

Think of an AllegroCache map as a
persistent hash table. To create a map, make an instance of the
class
(let ((cursor (create-class-cursor 'link))
      (map (make-instance 'ac-map-range)))
  ;; Use the cursor to walk over all instances of LINK and
  ;; insert them into the map.
  (loop for link = (next-class-cursor cursor)
        while link
        do
        (push (link-sentence link)
              (map-value map (link-term link))))
  ;; Release the cursor now that the walk is over.
  (free-class-cursor cursor)
  ;; You should commit the map and its values before using it.
  (commit)
  (values (map-count map)
          ;; What are the sentences linked by "persistent-class"?
          (map-value map "persistent-class")
          ;; Return one of the terms and count how many times it occurs.
          (block nil (map-map (lambda (term sentences)
                                (return (cons term (length sentences))))
                              map))))
=>
31
(#<SENTENCE oid: 2354, ver 6, trans: 102, not modified @ #x21269462>)
("*allegrocache*" . 2)
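For comparison, here is the same grouping done with an ordinary, non-persistent hash table. This is a sketch in portable Common Lisp with made-up data; the function name group-by-term is invented for the illustration. It is only meant to show how the three map operations used above line up with their hash-table counterparts: setf of map-value with setf of gethash, map-count with hash-table-count, and map-map with maphash.

```lisp
(defun group-by-term (pairs)
  "PAIRS is a list of (term . sentence) conses. Return an EQUAL hash
table mapping each term to the list of sentences paired with it."
  ;; The keys are strings, so the table must compare them with EQUAL.
  (let ((table (make-hash-table :test #'equal)))
    (loop for (term . sentence) in pairs
          ;; PUSH onto a GETHASH place, just as the map example pushes
          ;; onto a MAP-VALUE place. A missing key reads as NIL.
          do (push sentence (gethash term table)))
    table))

(let ((table (group-by-term '(("commit" . "s1")
                              ("rollback" . "s2")
                              ("commit" . "s3")))))
  (values (hash-table-count table)
          (gethash "commit" table)))
=>
2
("s3" "s1")
```

The difference that matters is what happens when Lisp exits: the hash table is gone, whereas the committed map and its values are still in the database next time you open it.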
Tip: Like

Maps also support

Although it’s easiest to work with CLOS instances which are persistent, you can also add non-persistent objects to your cache. There are three hoops through which you have to jump to make this work.
For example, suppose we have the following non-persistent class
definition:
(defmethod encode-object ((self word))
  (loop for slot in '(text id previous next)
        collect (slot-value self slot)))

(defmethod decode-object ((self word) values)
  (let ((word (make-instance (class-of self))))
    (loop for slot in '(text id previous next)
          as value in values
          do
          (setf (slot-value word slot) value))
    word))
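To see the two methods working as a pair, here is a round trip in portable Common Lisp which doesn't touch a database. Everything around the method bodies is an assumption made for the sketch: the scratch package encode-sketch, the stand-in generic functions (so that nothing here depends on AllegroCache), the helper round-trip, and the definition of word itself, of which only the four slot names are taken from the methods above. It also assumes that decode-object is handed some instance of the class to dispatch on, which is all the method above requires of its first argument.

```lisp
(defpackage :encode-sketch (:use :common-lisp))
(in-package :encode-sketch)

;; Hypothetical class definition: only the slot names come from the text.
;; Every slot gets an initform so that SLOT-VALUE never meets an unbound slot.
(defclass word ()
  ((text :initarg :text :initform nil)
   (id :initarg :id :initform nil)
   (previous :initform nil)
   (next :initform nil)))

;; Stand-ins for the generic functions which AllegroCache would supply.
(defgeneric encode-object (object))
(defgeneric decode-object (prototype values))

;; The two methods from the text.
(defmethod encode-object ((self word))
  (loop for slot in '(text id previous next)
        collect (slot-value self slot)))

(defmethod decode-object ((self word) values)
  (let ((word (make-instance (class-of self))))
    (loop for slot in '(text id previous next)
          as value in values
          do
          (setf (slot-value word slot) value))
    word))

(defun round-trip (word)
  "Encode WORD as a list of slot values, then decode that list into a
fresh instance of the same class."
  (decode-object (make-instance (class-of word))
                 (encode-object word)))

(let ((copy (round-trip (make-instance 'word :text "persistence" :id 7))))
  (list (slot-value copy 'text) (slot-value copy 'id)))
```

The copy is a distinct object with the same slot values. In this sketch previous and next are left as NIL, which sidesteps the question of what to do when those slots hold other words.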
Exercise: Why not just

Exercise: What might you do about encoding a

Exercise: Suppose instead that

So why not just make everything persistent? Playing devil’s advocate,
There are two outstanding topics to cover here: database administration and alternative approaches to persistence. Before we get to these, we need to issue a warning.

AllegroCache connections are not thread-safe: the consequences are undefined (but
probably not good) if two threads “use the same
connection” at the same time. Considering the space of all
libraries, this problem is not altogether uncommon and will hit you
from time to time whether or not they were written in Lisp. We
haven’t covered multi-threaded use of Lisp yet; it’s a
major topic and the subject of the next chapter. At the end of that
we’ll be able to come back briefly to AllegroCache and deal
with this difficulty. For now, note that “using a
connection” includes: creating, deleting, or accessing the
slots of, a persistent object; using an index; and calling
We don’t have space to go into much detail here. This
section only summarises the Administrator’s Guide and parts of
the Reference Manual, both of which you’ll find by visiting
the AllegroCache website and clicking on
The first point to make is that the AllegroCache database is implemented using structures known as b-trees. Think of a b-tree as a large, efficient, sorted, and (in this case) file-based hash table whose keys and values are arrays of numbers[2]. In case you want to work with b-trees directly (not something you should ever need to do with an AllegroCache database), documentation for the API is accessible via the URL above.

The b-trees live in files of type

Another use for log files is rebuilding the database after an upgrade. This is straightforward and full instructions are given in the Administrator’s Guide.

Finally, two points about efficiency. The first is that you can
use the accessor

Now that we’ve covered one approach to persistence in some detail, it’s worth glancing at a couple more. These are both open-source, free to use, and not commercially supported.

Arthur Lemmens’s Rucksack, released under an MIT-style license, runs on a variety of Lisp implementations: Allegro, LispWorks, SBCL, Clozure CL. The documentation is less thorough than that of AllegroCache: external symbols have documentation strings and there’s a short tutorial to get you started. General coding principles should be recognizable now that you’ve seen how to drive AllegroCache and, although it has fewer high-level utilities, you should be able to see from the following that it wouldn’t be hard to concoct your own:
(defun retrieve-from-index (class slot value)
  ;; Macro with-rucksack guarantees to close connection on the way out
  (with-rucksack (*rucksack* *rs-directory*)
    ;; Macro with-transaction attempts a COMMIT on exit and guarantees
    ;; that if the COMMIT fails it'll perform a ROLLBACK instead.
    (with-transaction ()
      ;; Even without calling (documentation 'rucksack-map-slot 'function)
      ;; it should be clear what this call does.
      (rucksack-map-slot *rucksack* class slot
                         (lambda (instance)
                           (return-from retrieve-from-index
                             instance))
                         :equal value)))
  nil)
Exercise: What does the above call to

Caution: A notable weakness with Rucksack is that it does not support an equivalent to AllegroCache’s client/server mode.

You can access the Rucksack repository, documentation, a talk about how it’s implemented, and a users’ mailing list by visiting:
Also worth a visit is Elephant which runs on SBCL, Allegro, LispWorks, Clozure CL and CMUCL. Note that it’s licensed under the more restrictive LLGPL (see Appendix B). The implementation isn’t 100% Lisp: it uses a serializer written in C and you have a choice of connecting to one of Berkeley DB, SQLite and PostgreSQL for the backing store. Once more this is an API which should look familiar and this time there’s extensive documentation to go with it. Elephant supports multiple connections and although its authors believe this will work over a network provided the underlying database supports it, they haven’t heard of anyone trying that out yet. The Elephant project homepage is at:
and you’ll find a comparison between Rucksack and Elephant here:
We conclude our look into persistence for Lisp with a note about more traditional database interfaces. Some of these are driven by sending SQL control strings to a database connection object and receiving lists of strings in return:
(sql "select term from link where term like 'c%';" :db *db-connection*)
=>
(("cl:open") ("close-database") ("close-database") ("commit") ("commit")
...)
;; Second return value is a list of "headings"
("term")
Exercise: What does the function

Other interfaces also support an object-oriented approach. The invocations are reminiscent of SQL but the results, as for queries in the persistence libraries, are lists of CLOS objects.
(select 'link
:where [like [slot-value 'link 'term] "c%"])
=>
((#<db-instance LINK 8067092>) (#<db-instance LINK 8069536>)
(#<db-instance LINK 8069176>) ...)
For more details, see the Allegro ODBC chapter in the ACL documentation, Common SQL in either the LispWorks documentation or my own tutorial on the subject at:
and finally—open-source and similar to the LispWorks implementation—CLSQL at:
Using any of these SQL interfaces has the advantage that you can share your data with other applications very easily. On the other hand you’ll lose at least some of the power which tightly integrated persistence can offer you. As to which of these will give you the greater benefit: this is between you and your project requirements.

That’s it for persistence. It’s been useful as an example of a language extension which comes “inside the box” for just one implementation, although it’s available for others, arguably in a weaker form, under open-source licenses. It’s also raised questions which bring us into contact with two further, proprietary but universal libraries. These questions will be answered in the next two chapters.

Exercise: What does happen to non-persistent objects? |