Working with lists of pages
Templates which render a list of content files (e.g. a list of blog posts or
pages belonging to a category) will need to filter or sort MDCONTENT
accordingly. In order to make this easier, MDCONTENT is wrapped in a list-like
object called MDContentList, which has the following methods:
General searching/filtering
Each of the following methods returns a new MDContentList containing those
entries for which the predicate (pred) is True.
- match_entry(self, pred): The pred (i.e. predicate) is a callable which receives the full information on each entry in the MDContentList and returns True or False.
- match_ctx(self, pred): The pred receives the context for each entry and returns a boolean.
- match_page(self, pred): The pred receives the page object for each entry and returns a boolean.
- match_doc(self, pred): The pred receives the markdown body for each entry and returns a boolean.
- url_match(self, url_pred): The url_pred receives the url (relative to htdocs) for each entry and returns a boolean.
- path_match(self, src_pred): The src_pred receives the path to the source document for each entry and returns a boolean.
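As an illustration, predicates of the kind these methods accept could look like the following. This is a minimal sketch: the sample entries and the predicate names are hypothetical, and page is assumed to behave like a dict. The filtering is done with plain list comprehensions so that the block runs on its own; the closing comments show the corresponding MDContentList calls.

```python
import re

# Hypothetical sample entries; real entries come from MDCONTENT.
entries = [
    {"url": "/blog/first/", "page": {"title": "First", "draft": True},
     "doc": "Notes on Python templates."},
    {"url": "/docs/setup/", "page": {"title": "Setup"},
     "doc": "How to install the site builder."},
]


def is_draft(page):
    """Predicate on the page variables (assumed to be dict-like)."""
    return bool(page.get("draft"))


def mentions_python(doc):
    """Predicate on the markdown body."""
    return re.search(r"\bpython\b", doc, flags=re.IGNORECASE) is not None


def under_docs(url):
    """Predicate on the URL."""
    return url.startswith("/docs/")


# Plain-Python stand-ins for the filtering, for illustration only.
drafts = [e for e in entries if is_draft(e["page"])]
python_docs = [e for e in entries if mentions_python(e["doc"])]
docs_pages = [e for e in entries if under_docs(e["url"])]

# With an MDContentList the same predicates would be passed directly:
#   MDCONTENT.match_page(is_draft)
#   MDCONTENT.match_doc(mentions_python)
#   MDCONTENT.url_match(under_docs)
```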
Specialized searching/filtering
All of these return a new MDContentList object (at least by default).
- posts(self, ordered=True): Returns a new MDContentList with those entries which are blog posts. In practice this means those with markdown sources in the posts/ or blog/ subdirectories or those which have a page.type of "post", "blog", "blog-entry" or "blog_entry". Normally ordered by date (newest first), but this can be turned off by setting ordered to False.
- not_posts(self): Returns a new MDContentList with "pages", i.e. those entries which are not blog posts.
- has_slug(self, sluglist), has_id(self, idlist): Entries with specific slugs/ids.
- in_date_range(self, start, end, date_key='DATE'): Posts/pages with a date between start and end. The key for the date field can be specified using date_key. Unless date_key is either DATE or MTIME, the key is looked up in the page variables for the entry.
- has_taxonomy(self, haystack_keys, needles): A general search for entries belonging to a taxonomy group, such as category, tag, section or type. The haystack_keys are the page variables to examine, while needles is a list of the values to look for in those variables. A string value for needles is treated as a one-item list. The search is case-insensitive.
- in_category(self, catlist): A shortcut method for self.has_taxonomy(['category', 'categories'], catlist).
- has_tag(self, taglist): A shortcut method for self.has_taxonomy(['tag', 'tags'], taglist).
- in_section(self, sectionlist): A shortcut method for self.has_taxonomy(['section', 'sections'], sectionlist).
- get_used_taxonomies(self): Get a list of all known taxonomies that are actually used by items in this MDContentList (i.e. content files). These may be of two types: (1) the standard taxonomies tags, sections, categories and authors; and (2) anything defined as a TAXONOMY in the frontmatter of a page. Returns a list of dicts with the keys taxon, name, name_singular and name_plural. If the taxonomy belongs to the latter group, then order, list_url, item_url_pattern and page_id will be present as well, and name_singular/name_plural may be empty. If a standard taxonomy (e.g. tags) has been handled as a content page TAXONOMY, then the latter type takes precedence (i.e. the standard one is omitted from the list).
- group_by(self, pred, normalize=None, keep_empty=False): Group items in an MDContentList using a given criterion. Parameters: pred is a callable receiving a content item and returning a string or a list of strings. For convenience, pred may also be specified as a string and is then interpreted as the value of the named page variable, e.g. category; normalize is an optional callable that transforms the grouping values, e.g. by truncating and lowercasing them; keep_empty should be set to True when content items whose predicate evaluates to the empty string are to be included in the result, since they are otherwise omitted. Returns a dict whose keys are strings and whose values are MDContentList instances.
- taxonomy_info(self, keys, order='count', tostring=None): Returns a list of dicts, where each dict corresponds to the slugified value of any of the keys in keys. The keys in the dict are name, slug, forms (different forms of name that appear in the result, e.g. upper/lowercase), count, and items (an MDContentList object). tostring, if present, is a callable that changes non-string and non-list values into strings for the purposes of grouping. Shorthand forms for common taxonomy types are available, namely get_categories(self, order='name'), get_tags(self, order='name'), get_sections(self, order='name'), and get_authors(self, order='name', tostring=None). These look for both singular and plural forms of the given keys, e.g. ['tag', 'tags'] for get_tags().
- page_match(self, match_expr, ordering=None, limit=None): A general matching method that does not require the caller to pass a predicate callable, which means that it can be employed in more varied contexts than the general methods described in the previous section. The match_expr contains the filtering specification; it is described further below. The ordering parameter, if specified, should be either title, slug, url or date, with an optional - in front to indicate reverse ordering. The date option for ordering may be followed by the preferred frontmatter date field after a colon, e.g. ordering='-date:modified_date' for a list with the most recently changed files at the top. The limit, if specified, indicates the maximum number of pages to return.
- page_match_sql(), get_db(), get_db_columns(): see "Searching/filtering using SQL" below.
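The semantics described for has_taxonomy() and group_by() can be modelled on plain lists of dicts. The two functions below are stand-ins written for this illustration, not the library's implementation; in particular, comparing needles for exact (case-insensitive) equality and treating a missing variable as the empty string are assumptions.

```python
def has_taxonomy(pages, haystack_keys, needles):
    """Keep pages where any haystack key holds any needle (case-insensitive)."""
    if isinstance(needles, str):
        needles = [needles]  # a string is treated as a one-item list
    wanted = {n.lower() for n in needles}
    found = []
    for page in pages:
        for key in haystack_keys:
            value = page.get(key)
            if value is None:
                continue
            values = [value] if isinstance(value, str) else list(value)
            if any(str(v).lower() in wanted for v in values):
                found.append(page)
                break  # one matching key is enough
    return found


def group_by(pages, pred, normalize=None, keep_empty=False):
    """Group pages under the string(s) returned by pred."""
    # A string pred names the page variable to group on.
    get_labels = pred if callable(pred) else (lambda page: page.get(pred, ""))
    groups = {}
    for page in pages:
        result = get_labels(page)
        if result is None:
            result = ""
        labels = [result] if isinstance(result, str) else list(result)
        for label in labels:
            if normalize is not None:
                label = normalize(label)
            if label == "" and not keep_empty:
                continue  # empty groups are dropped unless requested
            groups.setdefault(label, []).append(page)
    return groups


# Hypothetical page variables.
pages = [
    {"title": "A", "category": "News", "tags": ["Python", "web"]},
    {"title": "B", "category": "news", "tags": ["mako"]},
    {"title": "C", "tags": "python"},
]

tagged = has_taxonomy(pages, ["tag", "tags"], "PYTHON")
by_category = group_by(pages, "category", normalize=str.lower)
with_empty = group_by(pages, "category", normalize=str.lower, keep_empty=True)
```

Here normalize=str.lower merges "News" and "news" into one group, and page C (which has no category) only appears when keep_empty is True.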
A match_expr for page_match() is either a dict or a list of dicts. If it is
a dict, each page in the result set must match each of the attributes specified
in it. If it is a list of dicts, each page in the result set must match at least
one of the dicts (i.e., the returned result set contains the union of all
matches from all dicts in the list). When a string or regular expression match
is being performed in this process, it will be case-insensitive. The supported
attributes (i.e. dict keys) are as follows:
- title: A regular expression which will be applied to the page title.
- slug: A regular expression which will be applied to the slug.
- id: A string or list of strings (one of) which must match the page id exactly.
- url: A regular expression which will be applied to the target URL.
- path: A regular expression which will be applied to the path to the markdown source file (i.e. the source_file_short field).
- doc: A regular expression which will be applied to the body of the markdown source document.
- date_range: A list containing two ISO-formatted dates and optionally a date key (DATE by default) - see the description of in_date_range() above.
- has_attrs: A list of frontmatter variable names. Matching pages must have a non-empty value for each of them.
- attrs: A dict where each key is the name of a frontmatter variable and the value is the value of that attribute. If the value is a string, it will be matched case-insensitively. All key-value pairs must match.
- has_tag, in_section, in_category: The values are lists of tags, sections or categories, respectively, at least one of which must match (case-insensitively). See the methods with these names above.
- is_post: If set to True, will match if the page is a blog post; if set to False will match if the page is not a blog post.
- exclude_url: The page with this URL should be omitted from the results (normally the calling page).
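The dict-versus-list rule can be illustrated with a reduced model. The sketch below is not the library's code: it works on flat, hypothetical page dicts, supports only a handful of the attributes (title, slug, url, has_attrs, is_post), and the function names matches_one and page_match_model are made up for the example.

```python
import re


def matches_one(page, spec):
    """True if the page satisfies every attribute in one match_expr dict."""
    for key, wanted in spec.items():
        if key in ("title", "slug", "url"):
            # Regular expression match, case-insensitive.
            if not re.search(wanted, str(page.get(key, "")), flags=re.IGNORECASE):
                return False
        elif key == "has_attrs":
            # Every named variable must have a non-empty value.
            if not all(page.get(name) for name in wanted):
                return False
        elif key == "is_post":
            if bool(page.get("is_post")) != wanted:
                return False
        else:
            raise ValueError(f"attribute not modelled in this sketch: {key}")
    return True


def page_match_model(pages, match_expr):
    """A dict must match in full; a list of dicts matches if any dict does."""
    specs = [match_expr] if isinstance(match_expr, dict) else list(match_expr)
    return [p for p in pages if any(matches_one(p, s) for s in specs)]


pages = [
    {"title": "Release Notes 2.0", "slug": "release-2", "url": "/blog/release-2/",
     "is_post": True, "summary": "What changed"},
    {"title": "About", "slug": "about", "url": "/about/", "is_post": False},
    {"title": "Release checklist", "slug": "checklist", "url": "/docs/checklist/",
     "is_post": False, "summary": ""},
]

# One dict: all attributes must match (AND).
release_posts = page_match_model(pages, {"title": "^release", "is_post": True})

# List of dicts: the union of the matches of each dict (OR).
about_or_summarized = page_match_model(
    pages, [{"slug": "^about$"}, {"has_attrs": ["summary"]}]
)
```

The same match_expr values would be passed as the first argument to page_match() on an MDContentList.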
Searching/filtering using SQL
An MDContentList has three methods for examining the content using an SQLite
in-memory database:
- get_db(self): Builds a SQLite database containing a single table, content, whose structure is described below. Returns a connection to this database which can then be worked with using normal sqlite3/DBAPI methods. The database has a locale-sensitive collation called locale (which applies locale.strxfrm) and a custom function casefold (which simply applies the Python casefold string method). The row factory is sqlite3.Row, so row fields can be read using either column names or integer indices.
- get_db_columns(self): Returns a simple list of the columns in the content table.
- page_match_sql(self, where_clause=None, bind=None, order_by=None, limit=None, offset=None, raw_sql=None, raw_result=False, first=False): Either where_clause or raw_sql must be specified. In either case, if bind is specified, the bind variables there will be applied to the SQL upon execution. If order_by (a string), limit or offset (integers) are specified, they will be appended to the SQL before executing it against the database connection. The result will be a MDContentList unless raw_result is True, in which case it is a cursor object. (If raw_result is False but raw_sql is supplied, the column list in the SQL select statement must include source_file so as to permit the construction of an appropriate MDContentList). If first is True, only the first item from the results is returned (or None, if the results are empty).
The content table constructed by get_db() always contains the columns
source_file, source_file_short, url, target, template, MTIME, DATE,
doc, and rendered. In addition, it contains each page metadata field that
appears in any of the entries in the MDContentList in question. These will be
added as columns with the page_ prefix; for instance, the title field will
become page_title.
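To give a feel for the kind of connection get_db() is described as returning, the following sketch builds a comparable in-memory database with the standard sqlite3 module. It is a stand-in under stated assumptions: the table is reduced to three columns, the rows are invented, and the collation and function are re-created here by hand rather than taken from the library.

```python
import locale
import sqlite3


def locale_collation(a, b):
    """Compare two strings by their locale.strxfrm sort keys."""
    ka, kb = locale.strxfrm(a), locale.strxfrm(b)
    return (ka > kb) - (ka < kb)


def casefold(value):
    """Apply str.casefold, passing non-strings (e.g. NULL) through."""
    return value.casefold() if isinstance(value, str) else value


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.create_collation("locale", locale_collation)
conn.create_function("casefold", 1, casefold)

# Reduced stand-in for the content table: two fixed columns plus page_title.
conn.execute("CREATE TABLE content (source_file TEXT, url TEXT, page_title TEXT)")
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?)",
    [
        ("/src/content/b.md", "/b/", "banana"),
        ("/src/content/a.md", "/a/", "Apple"),
        ("/src/content/s.md", "/s/", "STRASSE"),
    ],
)

# Case-insensitive comparison through the custom casefold function.
rows = conn.execute(
    "SELECT source_file, page_title FROM content "
    "WHERE casefold(page_title) = casefold(?)",
    ("straße",),
).fetchall()

# Ordering through the custom collation; the result depends on the locale.
titles = [
    row["page_title"]
    for row in conn.execute(
        "SELECT page_title FROM content ORDER BY page_title COLLATE locale"
    )
]
```

Because the row factory is sqlite3.Row, rows[0]["page_title"] and rows[0][1] read the same field. A condition such as the WHERE expression above is the sort of thing that would be passed as where_clause to page_match_sql(), with the parameter supplied through bind.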
Only page fields whose keys match the regular expression ^[a-z]\w*$ are added
to the table. Thus, any metadata field with a key that is all uppercase,
titlecased, or contains non-word characters (such as hyphens) will be omitted.
Also, field names are case-sensitive in the raw metadata, but case-insensitive
in the database table, so inconsistently capitalized field names may lead to
unexpected results.
A field value that is not a string, integer, float, boolean, date, datetime,
or None will be serialized using json.dumps() with ensure_ascii set to False
(for easier utf-8 matching). Dates and datetimes are stringified. Booleans will
be represented as 1 or 0.
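The column-naming and value-conversion rules above can be expressed compactly. This is only a sketch of the stated rules; the helper names column_name and to_db_value are hypothetical, and using str() for dates is an assumption about what "stringified" means.

```python
import datetime
import json
import re

COLUMN_KEY = re.compile(r"^[a-z]\w*$")


def column_name(key):
    """Return the page_ column name for a metadata key, or None if omitted."""
    return "page_" + key if COLUMN_KEY.match(key) else None


def to_db_value(value):
    """Convert a metadata value to something SQLite can store."""
    if isinstance(value, bool):
        return int(value)  # booleans become 1 or 0
    if value is None or isinstance(value, (str, int, float)):
        return value
    if isinstance(value, (datetime.date, datetime.datetime)):
        return str(value)  # assumption: plain str() formatting
    # Lists, dicts and the like are stored as JSON text.
    return json.dumps(value, ensure_ascii=False)


kept = column_name("title")          # lowercase start: kept
dropped = column_name("og-image")    # hyphen: omitted
as_json = to_db_value(["æ", "b"])    # non-ASCII stays readable
```

The boolean check comes first because bool is a subclass of int in Python; otherwise True would be passed through unchanged.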
Sorting
All of these return a new MDContentList object with the entries in the
specified order.
- sorted_by(self, key, reverse=False, default_val=-1): A general sorting method. The key is the page variable to sort on, default_val is the value to assume if there is no such variable present in the entry, while reverse indicates whether the sort is to be descending (True) or ascending (False, the default).
- sorted_by_date(self, newest_first=True, date_key='DATE'): Sorting by date, newest first by default. The date key to sort on can be specified if desired.
- sorted_by_title(self, reverse=False): Sorting by page.title, ascending by default.
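The role of default_val is easiest to see on a small example. The function below is a plain-Python model of the described behaviour, operating on hypothetical page dicts with an invented weight variable; it is not the library's implementation.

```python
def sorted_by_model(pages, key, reverse=False, default_val=-1):
    """Sort page dicts on one variable, substituting default_val when absent."""
    return sorted(pages, key=lambda p: p.get(key, default_val), reverse=reverse)


pages = [
    {"title": "Contact", "weight": 30},
    {"title": "About"},  # no weight: treated as -1
    {"title": "Home", "weight": 10},
]

ascending = [p["title"] for p in sorted_by_model(pages, "weight")]
descending = [p["title"] for p in sorted_by_model(pages, "weight", reverse=True)]
```

In this sketch the default must be comparable with the other values: sorting on a string-valued variable with the integer default would raise a TypeError in Python 3, so a string default_val (such as '') would be the safer choice there.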
Pagination
- paginate(self, pagesize=5, context=None): Divides the MDContentList into chunks of size pagesize and returns a tuple consisting of the chunks and a list of page_urls (one for each page, in order). If an appropriate template context is provided, pages 2 and up will be written to the webroot output directory to destination files whose names are based upon the URL for the first page and the page number. Without the context, the page_urls will be None. It is the responsibility of the calling template to check the _page variable for the current page to be rendered (this defaults to 1). Each iteration will get all chunks and must use this variable to limit itself appropriately.
Typical usage of paginate():
<%
posts = MDCONTENT.posts()
chunks, page_urls = posts.paginate(5, context)
curpage = context.get('_page', 1)
%>
% for post in chunks[curpage-1]:
${ show_post(post) }
% endfor
% if len(chunks) > 1:
${ prevnext(len(chunks), curpage, page_urls) }
% endif
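The chunking and the chunks[curpage-1] lookup used in the template can be modelled as follows. The chunk function and the sample data are hypothetical and stand in for the first element of the tuple returned by paginate(); writing the extra pages and computing page_urls are not modelled.

```python
def chunk(items, pagesize=5):
    """Split items into consecutive lists of at most pagesize elements."""
    return [items[i:i + pagesize] for i in range(0, len(items), pagesize)]


posts = [f"post-{n}" for n in range(1, 13)]  # 12 hypothetical posts
chunks = chunk(posts, 5)

curpage = 3                    # the value _page would have on the third page
visible = chunks[curpage - 1]  # pages are numbered from 1, lists from 0
```

Every rendering pass sees the full list of chunks, which is why the template has to index it with the current page number.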
Render to an arbitrary file
- write_to(self, dest, context, extra_kwargs=None, template=None): Calls a template with the MDContentList in self as the value of CHUNK and writes the result to the file named in dest. The file path is relative to the webroot. Any missing directories are created. The template is by default the calling template, while extra_kwargs may be added if desired.
Typical usage of write_to():
<%
  if not CHUNK:
    for tag in tags:
      tagged = MDCONTENT.has_tag([tag])
      if not tagged:
        continue  # avoid potential infinite loop!
      outpath = '/tags/' + slugify(tag) + '/index.html'
      tagged.write_to(outpath, context, {'TAG': tag})
%>
% if CHUNK:
${ list_tagged_pages(TAG, CHUNK) }
% else:
${ list_tags() }
% endif