Working with lists of pages
Templates which render a list of content files (e.g. a list of blog posts or
pages belonging to a category) will need to filter or sort MDCONTENT
accordingly. In order to make this easier, MDCONTENT
is wrapped in a list-like
object called MDContentList
, which has the following methods:
General searching/filtering
Each of the following methods returns a new MDContentList
containing those
entries for which the predicate (pred
) is True.
-
match_entry(self, pred)
: Thepred
(i.e. predicate) is a callable which receives the full information on each entry in theMDContentList
and returns True or False. -
match_ctx(self, pred)
: Thepred
receives the context for each entry and returns a boolean. -
match_page(self, pred)
: Thepred
receives thepage
object for each entry and returns a boolean. -
match_doc(self, pred)
: Thepred
receives the markdown body for each entry and returns a boolean. -
url_match(self, url_pred)
: Thepred
receives theurl
(relative tohtdocs
) for each entry and returns a boolean. -
path_match(self, src_pred)
: Thepred
receives the path to the source document for each entry and returns a boolean.
Specialized searching/filtering
All of these return a new MDContentList
object (at least by default).
-
posts(self, ordered=True)
: Returns a newMDContentList
with those entries which are blog posts. In practice this means those with markdown sources in theposts/
orblog/
subdirectories or those which have apage.type
of "post", "blog", "blog-entry" or "blog_entry". Normally ordered by date (newest first), but this can be turned off by settingordered
to False. -
not_posts(self)
: Returns a newMDContentList
with "pages", i.e. those entries which are not blog posts. -
has_slug(self, sluglist)
,has_id(self, idlist)
: Entries with specific slugs/ids. -
in_date_range(self, start, end, date_key='DATE')
: Posts/pages with a date betweenstart
andend
. The key for the date field can be specifed usingdate_key
. Unless the value fordate_key
is eitherDATE
orMTIME
, then the key is looked for in thepage
variables for the entry. -
has_taxonomy(self, haystack_keys, needles)
: A general search for entries belonging to a taxonomy group, such as category, tag, section or type. Theyhaystack_keys
are thepage
variables to examine whileneedles
is a list of the values to look for in the values of those variables. A string value forneedles
is treated as a one-item list. The search is case-insensitive. -
in_category(self, catlist)
: A shortcut method forself.has_taxonomy(['category', 'categories'], catlist)
-
has_tag(self, taglist)
: A shortcut method forself.has_taxonomy(['tag', 'tags'], taglist)
. -
in_section(self, sectionlist)
: A shortcut method forself.has_taxonomy(['section', 'sections'], sectionlist)
. -
get_used_taxonomies(self)
: Get a list of all known taxonomies that are actually used by items in this MDContentList (i.e. content files). These may be of two types: (1) the standard taxonomies tags, sections, categories and authors; and (2) anything defined as aTAXONOMY
in the frontmatter of a page. Returns a list of dicts with the keystaxon
,name
,name_singular
andname_plural
. If the taxonomy belongs to the latter group, thenorder
,list_url
,item_url_pattern
andpage_id
will be present as well, andname_singular
/name_plural
may be empty. If a standard taxonomy (e.g. tags) has been handled as a content pageTAXONOMY
, then the latter type takes precedence (i.e. the standard one is omitted from the list). -
group_by(self, pred, normalize=None, keep_empty=False)
: Group items in an MDContentList using a given criterion. Parameters:pred
is a callable receiving a content item and returning a string or a list of strings. For convenience,pred
may also be specified as a string and is then interpreted as the value of the namedpage
variable, e.g.category
;normalize
is an optional callable that transforms the grouping values, e.g. by truncating and lowercasing them;keep_empty
should be set to True when the content items whose predicate evaluates to the empty string are to be included in the result, since they otherwise will be omitted. Returns a dict whose keys are strings and whose values areMDContentList
instances. -
taxonomy_info(self, keys, order='count', tostring=None)
: Returns a list of dicts, where each dict corresponds to the slugified value of any of the keys inkeys
. The keys in the dict arename
,slug
,forms
(different forms ofname
that appear in the result, e.g. upper/lowercase),count
, anditems
(an MDContentList object).tostring
, if present, is a callable that changes non-string and non-list values into strings for the purposes of grouping. Shorthand forms for common taxonomy types are available, namelyget_categories(self, order='name')
,get_tags(self, order='name')
,get_sections(self, order='name')
, andget_authors(self, order='name', tostring=None)
. These look for both singular and plural forms of the given keys, e.g.['tag', 'tags']
forget_tags()
. -
page_match(self, match_expr, ordering=None, limit=None)
: This is actually quite a general matching method but does not require the caller to pass a predicate callable to it, which means that it can be employed in more varied contexts than the general methods described in the last section. Amatch_expr
contains the filtering specification. It will be described further below. Theordering
parameter, if specified, should be eithertitle
,slug
,url
ordate
, with an optional-
in front to indicate reverse ordering. Thedate
option forordering
may be followed by the preferred frontmatter date field after a colon, e.g.ordering='-date:modified_date'
for a list with the most recently changed files at the top. Thelimit
, if specified, obviously indicates the maximum number of pages to return. -
page_match_sql()
,get_db()
,get_db_columns()
– see "Searching/filtering using SQL" below.
A match_expr
for page_match()
is either a dict or a list of dicts. If it is
a dict, each page in the result set must match each of the attributes specified
in it. If it is a list of dicts, each page in the result set must match at least
one of the dicts (i.e., the returned result set contains the union of all
matches from all dicts in the list). When a string or regular expression match
is being performed in this process, it will be case-insensitive. The supported
attributes (i.e. dict keys) are as follows:
title
: A regular expression which will be applied to the page title.slug
: A regular expression which will be applied to the slug.id
: A string or list of strings (one of) which must match the page id exactly.url
: A regular expression which will be applied to the target URL.path
: A regular expression which will be applied to the path to the markdown source file (i.e. thesource_file_short
field).doc
: A regular expression which will be applied to the body of the markdown source document.date_range
: A list containing two ISO-formatted dates and optionally a date key (DATE
by default) - see the description ofin_date_range()
above.has_attrs
: A list of frontmatter variable names. Matching pages must have a non-empty value for each of them.attrs
: A dict where each key is the name of a frontmatter variable and the value is the value of that attribute. If the value is a string, it will be matched case-insensitively. All key-value pairs must match.has_tag
,in_section
,in_category
: The values are lists of tags, sections or categories, respectively, at least one of which must match (case-insensitively). See the methods with these names above.is_post
: If set to True, will match if the page is a blog post; if set to False will match if the page is not a blog post.exclude_url
: The page with this URL should be omitted from the results (normally the calling page).
Searching/filtering using SQL
An MDContentList
has three methods for examining the content using an SQLite
in-memory database:
-
get_db(self)
: Builds a SQLite database containing a single table,content
, whose structure is described below. Returns a connection to this database which can then be worked with using normal sqlite3/DBAPI methods. The database has a locale-sensitive collation calledlocale
(which applieslocale.strxfrm
) and a custom functioncasefold
(which simply applies the Pythoncasefold
string method). The row factory issqlite3.Row
, so row fields can be read using either column names or integer indices. -
get_db_columns(self)
: Returns a simple list of the columns in thecontent
table. -
page_match_sql(self, where_clause=None, bind=None, order_by=None, limit=None, offset=None, raw_sql=None, raw_result=False, first=False)
: Eitherwhere_clause
orraw_sql
must be specified. In either case, ifbind
is specified, the bind variables there will be applied to the SQL upon execution. Iforder_by
(a string),limit
oroffset
(integers) are specified, they will be appended to the SQL before executing it against the database connection. The result will be aMDContentList
unlessraw_result
is True, in which case it is a cursor object. (Ifraw_result
is False butraw_sql
is supplied, the column list in the SQL select statement must includesource_file
so as to permit the construction of an appropriateMDContentList
). Iffirst
is True, only the first item from the results is returned (or None, if the results are empty).
The content
table constructed by get_db()
always contains the columns
source_file
, source_file_short
, url
target
, template
, MTIME
, DATE
,
doc
, and rendered
. In addition, it contains each page
metadata field that
appears in any of the entries in the MDContentList
in question. These will be
added as columns with the page_
prefix; for instance, the title
field will
become page_title
.
It should be noted that all page
fields added to the table will have to match
the regular expression ^[a-z]\w*$
. Thus, any metadata field with
a key that is all uppercase, titlecased, or contains non-word characters
(such as hyphens) will be omitted. Also, field names are case-sensitive in the
raw metadata, but case-insensitive in the database table, so inconsistently
capitalized field names may lead to unexpected results.
A field value that is not either string, integer, float, boolean, date, datetime,
or None, will be serialized using json.dumps()
with ensure_ascii
set to False
(for easier utf-8 matching). Dates and datetimes are stringified. Booleans will
be represented as 1 or 0.
Sorting
All of these return a new MDContentList
object with the entries in the
specified order.
-
sorted_by(self, key, reverse=False, default_val=-1)
: A general sorting method. Thekey
is thepage
variable to sort on,default_val
is the value to assume if there is no such variable present in the entry, whilereverse
indicates whether the sort is to be descending (True) or ascending (False, the default). -
sorted_by_date(self, newest_first=True, date_key='DATE')
: Sorting by date, newest first by default. The date key to sort on can be specified if desired. -
sorted_by_title(self, reverse=False)
: Sorting bypage.title
, ascending by default.
Pagination
paginate(self, pagesize=5, context=None)
: Divides theMDContentList
into chunks of sizepagesize
and returns a tuple consisting of the chunks and a list ofpage_urls
(one for each page, in order). If an appropriate template context is provided, pages 2 and up will be written to the webroot output directory to destination files whose names are based upon the URL for the first page (and the page number, of course). Without the context, thepage_urls
will be None. It is the responsibility of the calling template to check the_page
variable for the current page to be rendered (this defaults to 1). Each iteration will get all chunks and must use this variable to limit itself appropriately.
Typical usage of paginate()
:
<%
posts = MDCONTENT.posts()
chunks, page_urls = posts.paginate(5, context)
curpage = context.get('_page', 1)
%>
% for post in chunks[curpage-1]:
${ show_post(post) }
% endfor
% if len(chunks) > 1:
${ prevnext(len(chunks), curpage, page_urls) }
% endif
Render to an arbitrary file
def write_to(self, dest, context, extra_kwargs=None, template=None)
: Calls a template with theMDContentList
inself
as the value ofCHUNK
and write the result to the file named indest
. The file is of course relative to the webroot. Any directories are created if necessary. Thetemplate
is by default the calling template whileextra_kwargs
may be added if desired.
Typical usage of write_to()
:
<%
if not CHUNK:
for tag in tags:
tagged = MDCONTENT.has_tag([tag])
if not tagged:
continue # avoid potential infinite loop!
outpath = '/tags/' + slugify(tag) + '/index.html'
tagged.write_to(outpath, context, {'TAG': tag})
%>
% if CHUNK:
${ list_tagged_pages(TAG, CHUNK) }
% else:
${ list_tags() }
% endif