The World Wide Web, or just "the Web," has become the most interesting
and most rapidly expanding part of the Internet, a global network of
computers. Roughly speaking, the Web is a collection of Web pages. Each
Web page is a sequence of words, pictures, movies, audio messages, and
many more things. Most importantly, Web pages also contain links to other
Web pages.
A Web browser enables people to view Web pages. It presents a Web page as a
sequence of words, images, and so on. Some of the words on a page may be
underlined. Clicking on underlined words leads to a new Web page. Most
modern browsers also provide a Web page composer, a tool that
helps people create collections of Web pages. A composer can, among other
things, search for words or replace one word by another. In short, Web
pages are things that we should be able to represent on computers, and
there are many functions that process Web pages.
To simplify our problem, we consider only Web pages of words and nested Web
pages. One way of understanding such a page is as a sequence of words and
Web pages. This informal description suggests a natural representation of
Web pages as lists of symbols, which represent words, and Web pages, which
represent nested Web pages. After all, we have emphasized before that a
list may contain different kinds of things. Still, when we spell this idea
out as a data definition, we get something rather unusual:
A Web page (WP) is either
- empty;
- (cons s wp)
  where s is a symbol and wp is a Web page; or
- (cons ewp wp)
  where both ewp and wp are Web pages.
This data definition differs from that of a list of symbols in
that it has three clauses instead of two and that it has three
self-references instead of one. Of these self-references, the one at the
beginning of a constructed list is the most unusual one. We refer
to such Web pages as immediately embedded Web pages.
Because the data definition is unusual, we construct some examples of Web
pages before we continue. Here is a plain page:
'(The TeachScheme! Project aims
  to improve students' problem-solving
  and organization skills. It provides
  software and lecture notes as well as
  exercises and solutions for teachers.)
It contains nothing but words. Here is a complex page:
'(The TeachScheme Web Page
  Here you can find:
  (LectureNotes for Teachers)
  (Guidance for (DrScheme: a Scheme programming environment))
  (Exercise Sets)
  (Solutions for Exercises)
  For further information, write to scheme@cs)
The immediately embedded pages start with parentheses and the symbols
'LectureNotes, 'Guidance, 'Exercise, and
'Solutions. The second embedded Web page contains another embedded
page, which starts with 'DrScheme: as its first word. We say this page is
embedded with respect to the entire page.
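To see how a quoted page lines up with the three clauses of the data definition, the following sketch writes one small page twice: once with quote and once with explicit cons. It assumes a language level in which quoted lists are available (for example, Beginning Student with List Abbreviations, or #lang racket). The names small-page and small-page/cons and the symbols in them are made up for illustration.

```scheme
;; Assumption: a language with quoted lists, empty, first, and rest
;; (e.g. Beginning Student with List Abbreviations, or #lang racket).
;; small-page is a hypothetical example, not one of the pages above.
(define small-page
  '(Notes (Guidance for (DrScheme env)) end))

;; The same page, spelled out clause by clause with cons:
(define small-page/cons
  (cons 'Notes                                   ; a symbol, then a WP
        (cons (cons 'Guidance                    ; an immediately embedded WP
                    (cons 'for
                          (cons (cons 'DrScheme  ; a WP embedded one level deeper
                                      (cons 'env empty))
                                empty)))
              (cons 'end empty))))

(equal? small-page small-page/cons)  ; both spellings denote the same list
(first small-page)                   ; the symbol 'Notes
(first (rest small-page))            ; the immediately embedded page
```

The page that starts with 'Guidance is immediately embedded in small-page, while the page that starts with 'DrScheme is only embedded in it.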
Let's develop the function size, which consumes a Web page and
produces the number of words that it and all of its embedded pages contain:
;; size : WP -> number
;; to count the number of symbols that occur in a-wp
(define (size a-wp) ...)
The two Web pages above suggest two good examples, but they are too
complex. Here are three simpler examples, one per clause in the data
definition:
  (size empty)
= 0
  (size (cons 'One empty))
= 1
  (size (cons (cons 'One empty) empty))
= 1
The first two examples are obvious. The third one deserves a short
explanation. It is a Web page that contains one immediately embedded Web
page, and nothing else. The embedded Web page is the one of the second
example, and it contains the one and only symbol of the third example.
To develop the template for size, let's carefully step through the
design recipe. The shape of the data definition suggests that we need three
cond-clauses: one for the empty page, one for a page that
starts with a symbol, and one for a page that starts with an embedded Web
page. While the first condition is the familiar test for empty,
the second and third need closer inspection because both clauses in the
data definition use cons and a simple cons? won't
distinguish between the two forms of data.
If the page is not empty, it is certainly constructed,
and the distinguishing feature is the first item on the list. In other
words, the second condition must use a predicate that tests the first item
on a-wp:
;; size : WP -> number
;; to count the number of symbols that occur in a-wp
(define (size a-wp)
  (cond
    [(empty? a-wp) ...]
    [(symbol? (first a-wp)) ... (first a-wp) ... (size (rest a-wp)) ...]
    [else ... (size (first a-wp)) ... (size (rest a-wp)) ...]))
The rest of the template is as usual. The second and third cond
clauses contain selector expressions for the first item and the rest of the
list. Because (rest a-wp) is always a Web page and because
(first a-wp) is one in the third case, we also add a recursive
call to size for these selector expressions.
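The reason the second cond-clause tests (first a-wp) rather than a-wp itself can be checked directly. The sketch below is only an illustration. It assumes cons?, first, and empty as provided by the HtDP teaching languages (they are also available in #lang racket), and the names starts-with-word and starts-with-page are hypothetical.

```scheme
;; Assumption: cons?, first, and empty as in the HtDP teaching languages
;; (also available in #lang racket). Both names are hypothetical examples.
(define starts-with-word (cons 'One empty))
(define starts-with-page (cons (cons 'One empty) empty))

;; cons? holds for both forms, so it cannot tell them apart:
(cons? starts-with-word)             ; true
(cons? starts-with-page)             ; true

;; Looking at the first item does distinguish them:
(symbol? (first starts-with-word))   ; true
(symbol? (first starts-with-page))   ; false
```

Because the symbol? test fails exactly when the first item is itself a Web page, the third clause can safely use else.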
Using the examples and the template, we are ready to design size:
see the figure below. The differences between the definition and the
template are minimal, which shows again how much of a function we can design
by merely thinking systematically about the data definition for its inputs.
;; size : WP -> number
;; to count the number of symbols that occur in a-wp
(define (size a-wp)
  (cond
    [(empty? a-wp) 0]
    [(symbol? (first a-wp)) (+ 1 (size (rest a-wp)))]
    [else (+ (size (first a-wp)) (size (rest a-wp)))]))
Figure: The definition of size for Web pages
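As a quick check of the definition in the figure, the sketch below repeats it and applies it to the three simple examples and to a small nested page. The page nested-page is a made-up stand-in for the complex page above. It leaves out punctuation such as the comma, because a comma inside a quoted list is read as unquote rather than as part of a symbol. The sketch assumes a language level with quoted lists (for example, Beginning Student with List Abbreviations, or #lang racket).

```scheme
;; size : WP -> number
;; to count the number of symbols that occur in a-wp
(define (size a-wp)
  (cond
    [(empty? a-wp) 0]
    [(symbol? (first a-wp)) (+ 1 (size (rest a-wp)))]
    [else (+ (size (first a-wp)) (size (rest a-wp)))]))

;; The three examples, one per clause of the data definition:
(size empty)                              ; 0
(size (cons 'One empty))                  ; 1
(size (cons (cons 'One empty) empty))     ; 1

;; Hand-evaluation of the third example:
;;   (size (cons (cons 'One empty) empty))
;; = (+ (size (cons 'One empty)) (size empty))
;; = (+ (+ 1 (size empty)) 0)
;; = (+ (+ 1 0) 0)
;; = 1

;; nested-page is a hypothetical page with two levels of embedding.
(define nested-page
  '(Here you can find
    (Guidance for (DrScheme a Scheme programming environment))
    (Exercise Sets)))

;; 4 top-level words + 2 in (Guidance for ...) + 5 in (DrScheme ...)
;; + 2 in (Exercise Sets)
(size nested-page)                        ; 13
```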
Exercise 14.3.1
Briefly explain how to define size using its template and the
examples. Test size using the examples from above.
Exercise 14.3.2
Develop the function occurs1. The function consumes a Web page and a
symbol. It produces the number of times the symbol occurs in the Web page,
ignoring the nested Web pages.
Develop the function occurs2. It is like occurs1, but
counts all occurrences of the symbol, including in embedded Web
pages.
Exercise 14.3.3
Develop the function replace. The function consumes two symbols,
new and old, and a Web page, a-wp. It
produces a page that is structurally identical to a-wp but with
all occurrences of old replaced by new.
Exercise 14.3.4
People do not like deep Web trees because they require too many page
switches to reach useful information. For that reason a Web page designer
may also want to measure the depth of a page. A page containing only
symbols has depth 0. A page with an immediately embedded page has
the depth of the embedded page plus 1. If a page has several
immediately embedded Web pages, its depth is the maximum of the depths of
embedded Web pages plus 1. Develop depth, which consumes
a Web page and computes its depth.