Query with Python ================= Pre-Requisites -------------- This lesson expects that you read first the sections * Overview * Data Model The hands-on activity is expected to be done in this Lesson. Therefore, as you read the sections above, simply focus on understanding the concepts. For this lesson we will use the Python interface to MongoDB: `pymongo`_. In order to interact with pymongo, you first start the python interpreter with :: python and then, from the Python prompt: :: >>> you type: :: >>> import pymongo Then you create a connection with the pymongo interface by typing :: >>> cn = pymongo.Connection() and now you can see what databases are available, with: :: >>> print cn.database_names() This will respond with a list that may look like the following: :: [u'book', u'entertainment', u'imdb', u'test', u'local'] We can select a specific database called "entertainment" with the command: :: >>> db = cn['entertainment'] and we can take a look a the collections inside this database with: :: >>> print db.collection_names() to see the number of documents in a given collection we can do: :: >>> print db.movies.count() Putting all together we have: :: >>> import pymongo >>> cn = pymongo.Connection() >>> print cn.database_names() >>> db = cn['entertainment'] >>> print db.collection_names() >>> print db.movies.count() Query a Movie ------------- We can now query entries in the "movies" collection of the "entertainment" database. The simplest queries can be performed with the "find_one" command. This command will return the first document in the collection that satisfies a criteria the you provide. Let's first put the information about the movie in a variable called "where": :: >>> where = {"year":2008} then pass this variable "where" as the argument to the "find_one" function: :: >>> db.movies.find_one(where) this will result something similar to: :: {u'title': u'Vicky Cristina Barcelona', u'starts': [u'Rebecca Hall', u'Scarlett Johansson', u'Javier Bardem'], u'directors': [u'Woody Allen'], u'writers': [u'Woody Allen'], u'year': 2008, u'_id': ObjectId('50bba5f19171830ef0000000')} In this case, what we are doing is asking MongoDB to return the first document it finds in the "movies" collection, that satisfies the criteria of having its "year" property be equal to "2008". You can imagine that in a movies collection, there would be many entries that satisfy this condition. To get a full list of entries we can use the "find" function instead of the "find_one" function. The search is very similar, but now we store the output in a "results" variable: :: >>> results = db.movies.find(where) we can check how many entries be got, by using the "count" function :: >>> print results.count() and now we can print out each one of the entries in this results variable, by using the expression :: >>> for movie in results: ... print movie that will return a list of movies whose years are equal to 2008. We can be more selective on the number of properties that we want to see from this collection of movies. For example, we could ask to see only the "title" and "directors". To do this, we put that information in another variable, that here we decided to call "what": :: >>> what = {"title":1,"directors":1} and pass this "what" variable as the second argument of the "find" function :: >>> results = db.movies.find(where,what) and once again, print all the entries in the results :: >>> for movie in results: ... print movie Note that the "1" in front of the properties in the "what" variable is stating what fields we want to see. You may have noticed that Mongo is always returning and "_id" field. We can silence it by doing: :: >>> what = {"title":1,"directors":1,"_id":0} note that we are placing a "0" in front of "_id", to silence that field, while we are placing a "1" in front of "title" and "directors" in order to include those two properties in the output. To make more complex searches, we could include more conditions in the queries. For example we can search for all the movies by a given director after a certain year. :: >>> where = {"directors":"Woody Allen","year":{$gt:1990}} This will select of the movies directed by Woody Allen after 1990. Combined, the process will look like: :: >>> what = {"title":1,"directors":1,"_id":0,"year":1} >>> where = {"directors":"Woody Allen","year":{"$gt":1990}} >>> results = db.movies.find(where,what) >>> for movie in results: ... print movie Finally, we can sort that output before printing it, by doing: :: >>> for movie in results.sort("year"): ... print movie This expression will give us the list of movies sorted by their "year" in ascending order. We can transform it to do a descending order with the additional argument: :: >>> for movie in results.sort("year",pymongo.DESCENDING): ... print movie This concludes our tour of basic queries. Exercises ~~~~~~~~~ * Search for your favorite movie * Search for all the movies by the director of your favorite movie * Find how many movies are in the collection from before the year 2000 .. _pymongo: http://api.mongodb.org/python/current/#