mirror of https://github.com/jzillmann/pdf-to-markdown.git
synced 2025-02-06 12:59:12 +01:00
Add Markdown comparison tests
- Convert the `example PDFs` with the old `pdf-to-markdown` and write them to text files
- Compare the text files with the conversion of the current code
- Next:
  - Improve the current code to match good conversions of the old code
  - Adapt the text files in case the current conversion is better than the old
- Current tests are breaking
parent c531dba632
commit 78db114632
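The comparison described in the commit message can be sketched roughly as follows. This is a minimal, language-agnostic illustration written in Python, not the project's actual test code: the function name `compare_with_golden`, the `convert` callback, and the `Foo.pdf` / `Foo.md` naming convention are all assumptions made for the example.

```python
import difflib
from pathlib import Path
from typing import Callable, Dict, List


def compare_with_golden(
    examples_dir: Path,
    convert: Callable[[Path], str],  # hypothetical hook that runs the current converter
) -> Dict[str, List[str]]:
    """Return a unified diff for every example whose current conversion
    differs from the stored Markdown file produced by the old converter."""
    mismatches: Dict[str, List[str]] = {}
    for golden_path in sorted(examples_dir.glob("*.md")):
        # Assumed naming convention: Foo.md is the stored conversion of Foo.pdf.
        pdf_path = golden_path.with_suffix(".pdf")
        expected = golden_path.read_text(encoding="utf-8").splitlines()
        actual = convert(pdf_path).splitlines()
        diff = list(
            difflib.unified_diff(
                expected,
                actual,
                fromfile=f"old/{golden_path.name}",
                tofile=f"current/{golden_path.name}",
                lineterm="",
            )
        )
        if diff:  # an empty diff means both conversions agree
            mismatches[golden_path.name] = diff
    return mismatches
```

A test could call `compare_with_golden(Path("examples"), convert)` and fail when the result is non-empty. The returned diffs help decide, per file, whether to improve the current code or to update the stored text because the new conversion is better.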
74286
examples/Achieving-The-Paris-Climate-Agreement.md
Normal file
File diff suppressed because it is too large
9212
examples/Adventures-Of-Sherlock-Holmes.md
Normal file
File diff suppressed because it is too large
2965
examples/Alice-In-Wonderland.md
Normal file
File diff suppressed because it is too large
1572
examples/CC-NC_Leitfaden.md
Normal file
File diff suppressed because it is too large
340
examples/CC_License_Agreement_of_siMPle.md
Normal file
@@ -0,0 +1,340 @@
```
Developed by Aalborg University, Denmark and Alfred Wegener Institute, Germany
```
## Creative Commons Attribution-ShareAlike 4.

# International Public License Agreement of siMPle –

# Software for the automated detection of microplastic

```
By exercising the Licensed Rights (defined below), You accept and agree to be bound by
the terms and conditions of this Creative Commons Attribution-ShareAlike 4.
International Public License ("Public License"). To the extent this Public License may be
interpreted as a contract, You are granted the Licensed Rights in consideration of Your
acceptance of these terms and conditions, and the Licensor grants You such rights in
consideration of benefits the Licensor receives from making the Licensed Material
available under these terms and conditions.
```
```
Section 1 – Definitions.
```
a. Adapted Material means material subject to Copyright and Similar Rights that is derived
from or based upon the Licensed Material and in which the Licensed Material is translated,
altered, arranged, transformed, or otherwise modified in a manner requiring permission
under the Copyright and Similar Rights held by the Licensor. For purposes of this Public
License, where the Licensed Material is a musical work, performance, or sound recording,
Adapted Material is always produced where the Licensed Material is synched in timed
relation with a moving image.

b. Adapter's License means the license You apply to Your Copyright and Similar Rights in
Your contributions to Adapted Material in accordance with the terms and conditions of
this Public License.

c. BY-SA Compatible License means a license listed
at creativecommons.org/compatiblelicenses, approved by Creative Commons as
essentially the equivalent of this Public License.

d. Copyright and Similar Rights means copyright and/or similar rights closely related to
copyright including, without limitation, performance, broadcast, sound recording, and Sui
Generis Database Rights, without regard to how the rights are labeled or categorized. For
purposes of this Public License, the rights specified in Section 2(b)(1)-(2) are not Copyright
and Similar Rights.

e. Effective Technological Measures means those measures that, in the absence of proper
authority, may not be circumvented under laws fulfilling obligations under Article 11 of
the WIPO Copyright Treaty adopted on December 20, 1996, and/or similar international
agreements.

f. Exceptions and Limitations means fair use, fair dealing, and/or any other exception or
limitation to Copyright and Similar Rights that applies to Your use of the Licensed Material.


```
Developed by Aalborg University, Denmark and Alfred Wegener Institute, Germany
```
g. License Elements means the license attributes listed in the name of a Creative Commons
Public License. The License Elements of this Public License are Attribution and ShareAlike.

h. Licensed Material means the artistic or literary work, database, or other material to which
the Licensor applied this Public License.

i. Licensed Rights means the rights granted to You subject to the terms and conditions of
this Public License, which are limited to all Copyright and Similar Rights that apply to Your
use of the Licensed Material and that the Licensor has authority to license.

j. Licensor means the individual(s) or entity(ies) granting rights under this Public License.

k. Share means to provide material to the public by any means or process that requires
permission under the Licensed Rights, such as reproduction, public display, public
performance, distribution, dissemination, communication, or importation, and to make
material available to the public including in ways that members of the public may access
the material from a place and at a time individually chosen by them.

l. Sui Generis Database Rights means rights other than copyright resulting from Directive
96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal
protection of databases, as amended and/or succeeded, as well as other essentially
equivalent rights anywhere in the world.

m. You means the individual or entity exercising the Licensed Rights under this Public
License. Your has a corresponding meaning.

```
Section 2 – Scope.
```
a. License grant.

1. Subject to the terms and conditions of this Public License, the Licensor hereby grants You
a worldwide, royalty-free, non-sublicensable, non-exclusive, irrevocable license to
exercise the Licensed Rights in the Licensed Material to:

A. reproduce and Share the Licensed Material, in whole or in part; and

B. produce, reproduce, and Share Adapted Material.

2. Exceptions and Limitations. For the avoidance of doubt, where Exceptions and Limitations
apply to Your use, this Public License does not apply, and You do not need to comply with
its terms and conditions.
3. Term. The term of this Public License is specified in Section 6(a).
4. Media and formats; technical modifications allowed. The Licensor authorizes You to
exercise the Licensed Rights in all media and formats whether now known or hereafter
created, and to make technical modifications necessary to do so. The Licensor waives
and/or agrees not to assert any right or authority to forbid You from making technical
modifications necessary to exercise the Licensed Rights, including technical modifications


```
Developed by Aalborg University, Denmark and Alfred Wegener Institute, Germany
```
```
necessary to circumvent Effective Technological Measures. For purposes of this Public
License, simply making modifications authorized by this Section 2(a)(4) never produces
Adapted Material.
```
5. Downstream recipients.

A. Offer from the Licensor – Licensed Material. Every recipient of the Licensed Material
automatically receives an offer from the Licensor to exercise the Licensed Rights under the
terms and conditions of this Public License.

B. Additional offer from the Licensor – Adapted Material. Every recipient of Adapted Material
from You automatically receives an offer from the Licensor to exercise the Licensed Rights
in the Adapted Material under the conditions of the Adapter’s License You apply.

C. No downstream restrictions. You may not offer or impose any additional or different terms
or conditions on, or apply any Effective Technological Measures to, the Licensed Material
if doing so restricts exercise of the Licensed Rights by any recipient of the Licensed
Material.

6. No endorsement. Nothing in this Public License constitutes or may be construed as
permission to assert or imply that You are, or that Your use of the Licensed Material is,
connected with, or sponsored, endorsed, or granted official status by, the Licensor or
others designated to receive attribution as provided in Section 3(a)(1)(A)(i).

b. Other rights.

1. Moral rights, such as the right of integrity, are not licensed under this Public License, nor
are publicity, privacy, and/or other similar personality rights; however, to the extent
possible, the Licensor waives and/or agrees not to assert any such rights held by the
Licensor to the limited extent necessary to allow You to exercise the Licensed Rights, but
not otherwise.
2. Patent and trademark rights are not licensed under this Public License.
3. To the extent possible, the Licensor waives any right to collect royalties from You for the
exercise of the Licensed Rights, whether directly or through a collecting society under any
voluntary or waivable statutory or compulsory licensing scheme. In all other cases the
Licensor expressly reserves any right to collect such royalties.

```
Section 3 – License Conditions.
```
```
Your exercise of the Licensed Rights is expressly made subject to the following conditions.
```
a. Attribution.

1. If You Share the Licensed Material (including in modified form), You must:


```
Developed by Aalborg University, Denmark and Alfred Wegener Institute, Germany
```
```
A. retain the following if it is supplied by the Licensor with the Licensed Material:
i. identification of the creator(s) of the Licensed Material and any others designated to
receive attribution, in any reasonable manner requested by the Licensor (including by
pseudonym if designated);
ii. a copyright notice;
```
iii. a notice that refers to this Public License;

iv. a notice that refers to the disclaimer of warranties;

```
v. a URI or hyperlink to the Licensed Material to the extent reasonably practicable
```
```
Citing siMPle in academic papers:
```
- Primpke, S., A. Dias, P., Gerdts, G., Anal. Methods 11, 2138 – 2147. (2019)
- Liu, F., Olesen, K.B., Borregaard, A.R., Vollertsen, J., Sci. Total Environ. 671. (2019)
- Raman database: Cabernard, L.; Roscher, L.; Lorenz, C.; Gerdts, G.; Primpke, S., Environmental Science
& Technology 52 (22), 13279- 13288 (2018)

```
B. indicate if You modified the Licensed Material and retain an indication of any previous
modifications; and
C. indicate the Licensed Material is licensed under this Public License, and include the text
of, or the URI or hyperlink to, this Public License.
```
2. You may satisfy the conditions in Section 3(a)(1) in any reasonable manner based on the
medium, means, and context in which You Share the Licensed Material. For example, it
may be reasonable to satisfy the conditions by providing a URI or hyperlink to a resource
that includes the required information.
3. If requested by the Licensor, You must remove any of the information required by
Section 3(a)(1)(A) to the extent reasonably practicable.
b. ShareAlike.

```
In addition to the conditions in Section 3(a), if You Share Adapted Material You produce,
the following conditions also apply.
```
1. The Adapter’s License You apply must be a Creative Commons license with the same
License Elements, this version or later, or a BY-SA Compatible License.
2. You must include the text of, or the URI or hyperlink to, the Adapter's License You apply.
You may satisfy this condition in any reasonable manner based on the medium, means,
and context in which You Share Adapted Material.
3. You may not offer or impose any additional or different terms or conditions on, or apply
any Effective Technological Measures to, Adapted Material that restrict exercise of the
rights granted under the Adapter's License You apply.


```
Developed by Aalborg University, Denmark and Alfred Wegener Institute, Germany
```
```
Section 4 – Sui Generis Database Rights.
```
```
Where the Licensed Rights include Sui Generis Database Rights that apply to Your use of
the Licensed Material:
```
a. for the avoidance of doubt, Section 2(a)(1) grants You the right to extract, reuse,
reproduce, and Share all or a substantial portion of the contents of the database;

b. if You include all or a substantial portion of the database contents in a database in which
You have Sui Generis Database Rights, then the database in which You have Sui Generis
Database Rights (but not its individual contents) is Adapted Material, including for
purposes of Section 3(b); and

c. You must comply with the conditions in Section 3(a) if You Share all or a substantial
portion of the contents of the database.
For the avoidance of doubt, this Section 4 supplements and does not replace Your
obligations under this Public License where the Licensed Rights include other Copyright
and Similar Rights.

```
Section 5 – Disclaimer of Warranties and Limitation of Liability.
```
a. Unless otherwise separately undertaken by the Licensor, to the extent possible, the
Licensor offers the Licensed Material as-is and as-available, and makes no representations
or warranties of any kind concerning the Licensed Material, whether express, implied,
statutory, or other. This includes, without limitation, warranties of title, merchantability,
fitness for a particular purpose, non-infringement, absence of latent or other defects,
accuracy, or the presence or absence of errors, whether or not known or discoverable.
Where disclaimers of warranties are not allowed in full or in part, this disclaimer may not
apply to You.

b. To the extent possible, in no event will the Licensor be liable to You on any legal theory
(including, without limitation, negligence) or otherwise for any direct, special, indirect,
incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or
damages arising out of this Public License or use of the Licensed Material, even if the
Licensor has been advised of the possibility of such losses, costs, expenses, or damages.
Where a limitation of liability is not allowed in full or in part, this limitation may not apply
to You.

c. The disclaimer of warranties and limitation of liability provided above shall be interpreted
in a manner that, to the extent possible, most closely approximates an absolute disclaimer
and waiver of all liability.

```
Section 6 – Term and Termination.
```

```
Developed by Aalborg University, Denmark and Alfred Wegener Institute, Germany
```
a. This Public License applies for the term of the Copyright and Similar Rights licensed here.
However, if You fail to comply with this Public License, then Your rights under this Public
License terminate automatically.

b. Where Your right to use the Licensed Material has terminated under Section 6(a), it
reinstates:

1. automatically as of the date the violation is cured, provided it is cured within 30 days of
Your discovery of the violation; or
2. upon express reinstatement by the Licensor.

```
For the avoidance of doubt, this Section 6(b) does not affect any right the Licensor may
have to seek remedies for Your violations of this Public License.
```
c. For the avoidance of doubt, the Licensor may also offer the Licensed Material under
separate terms or conditions or stop distributing the Licensed Material at any time;
however, doing so will not terminate this Public License.

d. Sections 1 , 5 , 6 , 7 , and 8 survive termination of this Public License.

```
Section 7 – Other Terms and Conditions.
```
a. The Licensor shall not be bound by any additional or different terms or conditions
communicated by You unless expressly agreed.

b. Any arrangements, understandings, or agreements regarding the Licensed Material not
stated herein are separate from and independent of the terms and conditions of this
Public License.

```
Section 8 – Interpretation.
```
a. For the avoidance of doubt, this Public License does not, and shall not be interpreted to,
reduce, limit, restrict, or impose conditions on any use of the Licensed Material that could
lawfully be made without permission under this Public License.

b. To the extent possible, if any provision of this Public License is deemed unenforceable, it
shall be automatically reformed to the minimum extent necessary to make it enforceable.
If the provision cannot be reformed, it shall be severed from this Public License without
affecting the enforceability of the remaining terms and conditions.

c. No term or condition of this Public License will be waived and no failure to comply
consented to unless expressly agreed to by the Licensor.


```
Developed by Aalborg University, Denmark and Alfred Wegener Institute, Germany
```
d. Nothing in this Public License constitutes or may be interpreted as a limitation upon, or
waiver of, any privileges and immunities that apply to the Licensor or You, including from
the legal processes of any jurisdiction or authority.

```
Creative Commons is not a party to its public licenses. Notwithstanding, Creative
Commons may elect to apply one of its public licenses to material it publishes and in those
instances will be considered the “Licensor.” The text of the Creative Commons public
```
### licenses is dedicated to the public domain under the CC0 Public Domain Dedication.

```
Except for the limited purpose of indicating that material is shared under a Creative
Commons public license or as otherwise permitted by the Creative Commons policies
published at creativecommons.org/policies, Creative Commons does not authorize the
use of the trademark “Creative Commons” or any other trademark or logo of Creative
Commons without its prior written consent including, without limitation, in connection
with any unauthorized modifications to any of its public licenses or any other
arrangements, understandings, or agreements concerning use of licensed material. For
the avoidance of doubt, this paragraph does not form part of the public licenses.
```
```
Creative Commons may be contacted at creativecommons.org.
```
3133
examples/Closed-Syllables.md
Normal file
File diff suppressed because it is too large
219
examples/ExamplePdf.md
Normal file
@@ -0,0 +1,219 @@
# Mega Überschrift

## 2te Überschrift

```
Dies ist eine Test-PDF^1.
Für’s Testen des Markdown Parsers.
```
(^1) In Deutsch.



## Paragraphen

Das ist ein Paragraph. Ein einfacher Paragraph mit Schrift in Normalgröße^2. Damit wir _sehen_ wie
sich Zeilenumbrüche verhalten, schreiben wir einfach ein bisschen mehr. So, dass sieht ja jetzt
schon gut aus!
Ohne Zwischenzeile, neu angesetzt.

Mit Zwischenzeile, neu angesetzt.

Und nachfolgend ein etwas längerer Tex:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi laoreet diam nibh, sit amet bibendum
metus tristique vel. Sed neque nulla, lacinia sit amet ex ut, ultrices dictum turpis. Praesent fringilla,
lacus nec lobortis placerat, lorem ipsum convallis nisl, sit amet imperdiet erat arcu id arcu. Aenean
accumsan risus in purus facilisis interdum. Aliquam tincidunt condimentum est, scelerisque
venenatis orci. Fusce neque nibh, dapibus et volutpat sit amet, consectetur ac quam. Sed pharetra
faucibus arcu, at interdum dui ornare ut. Aliquam sodales, magna et euismod congue, ipsum diam
tempus sapien, vel aliquet tortor dolor ut purus. Aenean aliquet ut erat vitae dictum. Fusce eget
ultrices magna. Sed egestas mi nec rutrum iaculis. Phasellus condimentum^3 , urna sit amet sodales
accumsan, lacus risus cursus ipsum, et rhoncus ligula mi et nibh. In consequat a risus a
accumsan. Pellentesque nec lacus sodales eros laoreet pretium non ac erat.

Und jetzt ein kleiner Text im block-format. Das erzeugt schöne doppelte Leerzeichen zwischen
Wörtern. Wenn Markdown zu HTML gerendert wird, fällt das zwar nicht mehr auf. Aber in der puren
Text-Version ist es schon stark sichtbar!

Und jetzt^4 einfach nochmal Text^5 um die Fussnoten in zweistellige Bereiche^6 vorranzutreiben!

(^2) Was immer auch ‘normal’ ist...
(^3) Nicht zu verwechseln mit ‘condimenta’. Meine Lateinkenntnisse sind zwar schon so alt das ich
überhaupt keine Ahnung hab, aber zumindest hab ich jetzt eine mehrzeilige Fussnote!
(^4) Hier & Jetzt!
(^5) Nicht viel mehr als ein Satz.
(^6) Weil dann wird's komplizierter!


## Schriftschnitt

Etwas _kursiv_ ist auch nicht schlecht. **Fett** ist auch interessant. Und was ist mit
**_FETTUNDKURSIV_**?

Interessant wird's wenn _mehrere Wörter hintereinanderweg formatiert_ sind. Und _dann noch über
Zeilenbrüche hinweg_.

Fies könnte es werden mit _abwechselnden_ **Formaten**. Und das ganze dann noch _über_ **mehrere**
_Zeilen_ hinweg.

Und weil es so schön ist, fangen wir jetzt in dieser Zeile mit einem Schriftschnitt, nämlich _kursiv an.
Ziehen es über die gesamte zweite Zeile durch. Ist nicht ganz leicht, aber schaffen wir! Und lassen
es dann Mitte_ der 3ten **Zeile** ausklingen.

Und nun _kursiv_ Und **Fett** Zusammen _Ge_ **Mixt**. Ohne Leerzeichen...

_Eine_ Zeile, die mit kursiv anfing und endet mit **fett.**

Beende die Zeile mit **fett.**
_Kursiv_ ist dann die nachfolgende!

Eine Liste mit unterschiedlich formatierten Wörtern

- Etwas _Kursiv_
- Etwas **Fett**
- Etwas Unterstrichen^7
- Etwas Durchgestrichen
- Und noch ein Link: [http://pdf2md.morethan.io](http://pdf2md.morethan.io)

Ne Zeile die _kursiv endet,
und in ner_ (fast) _komplett lasziven, eh, kursiven Zeile endet._

**Etwas eher unwahrscheinliches. Zeile komplette fett.**
_Zeile komplett kursiv._
**Und wieder fett.**
_Und_ **gemixt**.

_Ein kompletter Absatz in kursiver Schriftform. Was will ich damit erreichen? Ich will es sehen,
einfach nur sehen! Gibt sicher noch andere sehenswerte Sachen im Leben, aber JETZT,
interessiert mich ein kursiver Text Block! ;)_

_Und ein folgender Absatz, auch kursiv!_

_Und ein kursiver Setzt der einen eingeschlossen Link, nämlich [http://pdf2md.morethan.io,](http://pdf2md.morethan.io,) hat._

(^7) Fussnote in einer Liste


## Listen

Nun eine Liste mit dash’s:

- Eintrag 1
- Eintrag 2, aber mit so langem Text, das er umbricht. Wirklich, wirklich lang. Breche du Zeile. Na
los. Na endlich. Vielleicht sollt ich das auf 3 Zeilen erweitern? Na ja, schaden kann es ja nicht.
Also los!
- Eintrag 3

Und Untergruppen:

- Eintrag 1
- Sub Eintrag 1.1, aber mit so langem Text, das er umbricht. Wirklich, wirklich lang. Breche du
Zeile. Na los. Na endlich. Vielleicht sollt ich das auf 3 Zeilen erweitern? Na ja, schaden kann
es ja nicht. Also los!
- Sub Eintrag 1.
- Eintrag 2
- Sub Eintrag 2.

Und eine mit bullet’s:

- Eintrage 1
- Eintrage 2

Gemixt:

- Eintrage 1
- Eintrage 2

Nummerierte Liste:

1. Eins
2. Zwei, aber mit so langem Text, das er umbricht. Wirklich, wirklich lang. Breche du Zeile. Na los.
Na endlich. Vielleicht sollt ich das auf 3 Zeilen erweitern? Na ja, schaden kann es ja nicht. Also
los!
3. Drei
4. Vier. Und auch hier wieder ein etwas längerer Text, so dass der Eintrag über mehrere Zeilen
geht!

Zentrierte Liste:

- Eintrag 1
- Eintrag 2, aber mit so langem Text, das er umbricht. Wirklich, wirklich lang. Breche du Zeile.
Na los. Na endlich. Vielleicht sollt ich das auf 3 Zeilen erweitern? Na ja, schaden kann es
ja nicht. Also los!
- Eintrag 3

Zwei aufeinander folgende Listen:

- Erste 1
- Erste 2
- Zwote 1
- Zwote 2


Liste mit drei Levels:

- Erster Level 1
- Zwoter Level 1.1, aber mit so langem Text, das er umbricht. Wirklich, wirklich lang. Breche du
Zeile. Na los. Na endlich. Vielleicht sollt ich das auf 3 Zeilen erweitern? Na ja, schaden kann
es ja nicht. Also los!
- 3ter Level 1.1.
- 3ter Level 1.1.2, aber mit so langem Text, das er umbricht. Wirklich, wirklich lang. Breche
du Zeile. Na los. Na endlich. Vielleicht sollt ich das auf 3 Zeilen erweitern? Na ja, schaden
kann es ja nicht. Also los!
- Zwoter Level 1.
- Zwoter Level 1.
- 3ter Level 1.3.
- Erster Level 2

Und nun nummeriert mit un-nummerierten Sub-Leveln:

1. Eintrag 1
- Eintrag 1.
- Eintrag 1.
2. Eintrag 2

Und jetzt eine Liste, die übergangslos aus einem zwei-zeiligen Paragraphen folgt. Mal sehen ob
der Parser das sauber trennen kann:

- Eintrag 1
- Eintrag 2
Und danach kommt auch gleicht was.


## Quotes & Spezielle Einschübe^8

Das hier ist wieder ein normaler Absatz. Das interessante ist der nachfolgende Teil, der
eingeschoben ist, gewöhnlicher Weise sowas wie ein Zitat, oder Code, oder sonst was:

```
Wenn ein chaotischer Schreibtisch eine chaotische Denkweise widerspiegelt, welche Denkweise
spiegelt dann ein leerer Schreibtisch wider? - Albert Einstein
```
So, das war ja schonmal ein guter Anfang. Hier noch ein Einzeiler:

```
Phantasie ist wichtiger als Wissen , denn Wissen^9 ist begrenzt. - Albert Einstein^10
```
Und nun mehrere Quotes hintereinander:

```
Die größte Macht hat das richtige Wort zur richtigen Zeit. - Mark Twain
```
```
Der Kuss ist ein liebenswerter Trick der Natur, ein Gespräch zu unterbrechen, wenn Worte
überflüssig werden. - Ingrid Bergman
```
```
Das Schicksal wird schon seine Gründe haben. - Voltaire
```
### Heading 2

abc

### Heading 2 II

(^8) Eine Überschrifts-Fussnote... so was gibts auch!
(^9) Wisse, dass ist eine Fussnote in einem Zitat!
(^10) Der Albert Einstein (Fussnote im Zitat, am Ende der Zeile)
202
examples/Flash-Masques-Temperature.md
Normal file
@ -0,0 +1,202 @@
|
||||
# La vraie température!
|
||||
|
||||
## qui permet de laver en toute sécurité
|
||||
|
||||
## les masques barrières en tissu
|
||||
|
||||
```
|
||||
avec le soutien de
|
||||
```
|
||||
## lF
|
||||
|
||||
## a
|
||||
|
||||
## sh
|
||||
|
||||
## -
|
||||
|
||||
## M
|
||||
|
||||
## asque
|
||||
|
||||
### Etude n° 2
|
||||
|
||||
```
|
||||
Le 26 mars dernier, le guide des spécifi-
|
||||
cations AFNOR S76-001 recommande
|
||||
de laver ses masques à 60°C pendant au
|
||||
moins 30 mn. Comment faire pour ceux et
|
||||
celles qui n’ont pas de lave-linge? L’étude
|
||||
n°1, publiée le 29 mars, teste différents
|
||||
récipients de la vie courante permettant de
|
||||
maintenir de l’eau chaude à 60°C pendant
|
||||
au moins 1/2 heure. Voici l’étude
|
||||
```
|
||||
```
|
||||
CC $
|
||||
BY NC ND
|
||||
```
|
||||
|
||||
```
|
||||
Florence Bost • 06 82 69 89 82 • florence@sablechaud.eu • http://www.sablechaud.eu
|
||||
```
|
||||
### Contexte
|
||||
|
||||
La période de confinement décrétée par le gouvernement français depuis le mardi 17/03/20 et pro-
|
||||
longée jusqu’au 4/05/20 est dans le but de stopper la pandémie du virus COVID-19. Une situation
|
||||
inédite qui laisse le grand public sans réponse face à des gestes de tous les jours pour faire face à la
|
||||
pandémie dans des conditions exceptionnelles de confinement.
|
||||
Le guide des spécifications AFNOR S76-001 sorti en express a pour but d’aider les industriels, et le
|
||||
public à confectionner des masques barrières en tissu le plus adéquat possible. Dans cette norme, il
|
||||
est préconisé de laver pendant 30 mn à 60°C les masques en tissu.
|
||||
|
||||
But de l’expérience :
|
||||
Dans la réalité de tous les jours, plusieurs cas de figures se présentent : les personnes n’ont pas de
|
||||
lave-linge (étudiants, célibataires...), les personnes ont un lave-linge mais ne peuvent ou ne veulent
|
||||
pas lancer une machine que pour les masques, les personnes n’ont pas de thermomètre à disposi-
|
||||
tion. Or l’immédiateté du lavage est important dans le process de non-propagation et destruction du
|
||||
virus. Le lavage à la main est donc une alternative intéressante.
|
||||
Le but de l’expérience dans un premier temps, est de savoir si l’on peut réunir les conditions essen-
|
||||
tielles d’une eau à 60°C pendant 30 mm dans un contexte de confinement.
|
||||
|
||||
Expérience : mesures de l’évolution de la température en fonction du temps dans des récipients
|
||||
ordinaires à usage familial.
|
||||
|
||||
Outils :
|
||||
|
||||
- Un thermomètre de cuisine équipé d’une sonde (HABOR)
|
||||
- Une bouilloire électrique à la température non réglable (RUSSEL-HOBS)
|
||||
- 3 types de récipients sélectionnés : - une casserole inox,
|
||||
- un saladier en porcelaine
|
||||
- un récipient plastique (Tupperware)
|
||||
- Un masque en tissu par récipient
|
||||
|
||||
Protocole de mesure :
|
||||
|
||||
### STUDY of hot-water temperature decay for washing fabric barrier masks in a confined household setting
|
||||
|
||||
|
||||
```
|
||||
Florence Bost • 06 82 69 89 82 • florence@sablechaud.eu • http://www.sablechaud.eu
|
||||
```
|
||||
- Boil 1.5 l of water in an electric kettle until it switches off automatically
- Pour the water into the container
- Take the reference measurement
- Record measurements every 5 minutes

The experiment was carried out twice, in 2 configurations:

- without a lid
- with a lid (resting on top)
|
||||
Notes:
Room temperature at the time of the experiment was 23°C, with no window open.
Temperature of hot water from the kitchen tap: 49°C
Ambient humidity: 42%.
Fluid: municipal water
Temperature measurements are in °C.
Date and place: 29-03-20 - Paris
|
||||
|
||||
Results:

The average reference temperature recorded is 92°C for the stainless steel and plastic containers
and 87°C for the porcelain one. The difference arises because the porcelain container is
thicker than the others and initially absorbs more heat.

The most convincing results come from the "container + lid" measurements.
All materials perform well, with plastic slightly ahead.

On average, it takes 40 min for the temperature to fall to the 60°C limit. The minimum
starting temperature needed to hold for 30 min is about 75°C, whatever the container.

Conclusion

- Keeping the temperature at 60°C for 30 min, as recommended in the AFNOR S76-001
specification guide for hand washing fabric barrier masks, can be
achieved in any container provided it is well covered.
- If you have an adjustable electric kettle, you can set the maximum temperature to
70°C.
|
||||
|
||||
### STUDY (continued)
|
||||
|
||||
NOTE:
This study was carried out in the particular context of lockdown. The experiment was conducted rigorously and remains
informative. The author cannot be held responsible for any over-interpretation of the results by a third party.
|
||||
|
||||
|
||||
## Who are we?

## Access links

This series of studies is an initiative of Florence Bost and her network,
supported by the Techtera competitiveness cluster. Some of the studies
contributed to the development of the MasKaDom project led by IMdR.
Each complete study is available on the sites listed below;
the studies may not be used for commercial purposes without the express
written permission of the authors.
|
||||
|
||||
Florence Bost, a smart textiles designer, has worked as a consultant since 2003 under the name
Sable Chaud. She produces idea books, forward-looking studies, e-textile design work
and demonstrators, as well as professional e-textile training.
Author of the book «Textiles, innovations et matières actives», published by Eyrolles,
she is also a French AFNOR expert on the European standardization committee
for smart textiles (CEN/TC 248-WG31). The mask studies were supported
by the complementary expertise of:
Mattias Ganem
Textile engineer, IFM, currently R&D and sustainable development project manager.
Jean-Baptiste Chot-Plassot
General engineer, IFM, currently innovation project engineer - Fashion & Textile

Some illustrations were produced by the stylist and illustrator Virginie Boy.
|
||||
Techtera is the competitiveness cluster dedicated to the French textile industry, supported by the State,
the Direction Générale de l’Armement and local authorities. It leads a network of
200 members (companies, research laboratories, technical centers, universities and
grandes écoles), with the main objective of stimulating competitiveness through collaborative
innovation. Since 2005, Techtera has supported more than 215 funded R&D projects
with a total budget of 556.5 million euros, aimed at application markets
in health, sports and leisure, transport, construction, protection and
security, clothing and decoration.
|
||||
Techtera - News

with the support of
|
||||
## Flash-Masque
|
||||
|
||||
```
|
||||
Sable Chaud - COVID-
|
||||
```
|
||||
|
5329
examples/Grammar-Matters.md
Normal file
File diff suppressed because it is too large
4678
examples/Life-Of-God-In-Soul-Of-Man.md
Normal file
File diff suppressed because it is too large
12159
examples/Made-with-cc.md
Normal file
File diff suppressed because it is too large
2427
examples/Safe-Communication.md
Normal file
File diff suppressed because it is too large
2051
examples/St-Mary-Witney-Social-Audit.md
Normal file
File diff suppressed because it is too large
19273
examples/The-Art-of-Public-Speaking.md
Normal file
File diff suppressed because it is too large
552
examples/The-Impact-of-Open-Access-Latin-American-Scholarship.md
Normal file
@ -0,0 +1,552 @@
|
||||
```
|
||||
Andrew W. Mellon Foundation
|
||||
Grant 1711-05155
|
||||
December 19, 2019
|
||||
```
|
||||
```
|
||||
John Kiplinger
|
||||
Valerie Yaw
|
||||
```
|
||||
# The Impact of Open Access Latin American Scholarship:

## Digitizing the Backlist of El Colegio de México’s Press
|
||||
|
||||
## WHITE PAPER
|
||||
|
||||
|
||||
In 2018, JSTOR received a grant from the Andrew W. Mellon Foundation to support the
|
||||
digitization of out-of-print titles from the Dirección de Publicaciones de El Colegio de
|
||||
México, A.C., as well as the dissemination of those titles on an openly accessible basis.
|
||||
Throughout the year-and-a-half-long project, we worked in deep collaboration with El
|
||||
Colegio de México Press to complete this project. This white paper is intended to
|
||||
document the significance of this work, the process we used to select titles, and what we
|
||||
have learned so far about the usage of these titles on the JSTOR platform. We hope this
|
||||
will help to benefit other initiatives interested in increasing access to out-of-print
|
||||
materials.
|
||||
|
||||
Copyright 2019 ITHAKA. This work is licensed under a Creative Commons Attribution-
|
||||
NonCommercial 4.0 International License. To view a copy of the license, please see
|
||||
[http://creative-commons.org/licenses/by-nc/4.0/](http://creative-commons.org/licenses/by-nc/4.0/).
|
||||
|
||||
ITHAKA is interested in disseminating this paper as widely as possible. Please contact us
|
||||
with any questions about using the report at support@jstor.org.
|
||||
|
||||
|
||||
_This project was made possible by The Andrew W. Mellon Foundation. Any views or
|
||||
recommendations expressed in this paper do not necessarily represent those of The
|
||||
Andrew W. Mellon Foundation._
|
||||
|
||||
_The Dirección de Publicaciones de El Colegio de México, A.C. was established in 1938.
|
||||
It offers a catalog of more than 2,400 titles and nine academic journals across the
|
||||
humanities and social sciences._
|
||||
|
||||
_JSTOR, a service of the not-for-profit organization ITHAKA, collaborates with the
|
||||
academic community to help libraries connect students and faculty to vital content
|
||||
while lowering costs and increasing shelf space; provides independent researchers with
|
||||
free and low-cost access to scholarship; and helps publishers reach new audiences and
|
||||
preserve their content for future generations._
|
||||
|
||||
|
||||
JSTOR gratefully acknowledges the contributions and cooperation of the following:
|
||||
|
||||
1. Gabriela Said Reyes, Director, Dirección de Publicaciones de El Colegio de
|
||||
México, A.C.
|
||||
2. Ninel Salcedo Romero, former Director of Marketing, Dirección de
|
||||
Publicaciones de El Colegio de México, A.C.
|
||||
3. Brian Connaughton, Área de Historia Regional y Comparada, Departamento de
|
||||
Filosofia, Universidad Autónoma Metropolitana
|
||||
4. Robert Darnton, Carl H. Pforzheimer University Professor and University
|
||||
Librarian, Emeritus, Harvard University
|
||||
5. Gilbert Joseph, Farnam Professor of History & International Studies, Yale
|
||||
University; Past President, Latin American Studies Association
|
||||
6. Herbert S. Klein, Gouverneur Morris Professor Emeritus of History, Columbia
|
||||
University; former Director of the Center for Latin American Studies and Professor
|
||||
of History at Stanford University; Research Scholar & Latin American Curator,
|
||||
Hoover Institution, Stanford University
|
||||
7. Jocelyn Olcott, Associate Professor, History and Gender, Sexuality & Feminist
|
||||
Studies, Duke University
|
||||
8. William B. Taylor, Muriel McKevitt Sonne Professor of Latin American History,
|
||||
Emeritus, University of California, Berkeley
|
||||
9. Pardha Karamsetty, President, Content & Media Solutions, Apex CoVantage; CEO,
|
||||
Apex CoVantage India
|
||||
10. Prabhanjan Mattam, Project Manager, Apex CoVantage
|
||||
|
||||
|
||||
**Summary**
|
||||
|
||||
In 2018, JSTOR received a grant from the Andrew W. Mellon Foundation to support a
|
||||
collaboration with the Dirección de Publicaciones de El Colegio de México, A.C., the
|
||||
press of El Colegio de México, a graduate research institution in Mexico City^1. This grant
|
||||
enabled JSTOR to digitize nearly 700 books from the press’s backlist in the humanities
|
||||
and humanistic social sciences, and make these books freely and openly available on the
|
||||
JSTOR online platform.
|
||||
|
||||
The goal of this project was to digitize and make openly accessible scholarship from the
|
||||
backlist of El Colegio de Mexico’s Press that would be of significant value to students and
|
||||
researchers in a range of humanities disciplines.
|
||||
|
||||
The work on this project proceeded in three phases, including a preparation and
|
||||
selection process, in which JSTOR worked with experts in the field to determine which
|
||||
books would be digitized; a digitization and ingest phase resulting in the books being
|
||||
hosted openly on JSTOR; and an analysis phase, in which JSTOR sought to develop a
|
||||
better understanding of the impact that foreign-language materials can have when
|
||||
hosted on a global platform.
|
||||
|
||||
This project brought together Colmex’s rich scholarly backlist with JSTOR’s experience
|
||||
managing retrospective digitization projects and helping to increase the impact of
|
||||
academic content by making that content easy to find and use online. Colmex and
|
||||
JSTOR have collaborated over the past several years to make Colmex’s frontlist books
|
||||
available to readers around the world through JSTOR.org. In this project, we sought to
|
||||
build on that collaboration by making a selection of books from the Press’s backlist
|
||||
available in digital form for the first time. In this white paper, we document our process
|
||||
for selection and digitization of books and provide a high-level analysis of usage of the
|
||||
content on the JSTOR platform.
|
||||
|
||||
**Introduction: History, Context, and Significance of the Collection**
|
||||
|
||||
The press of El Colegio de México has published a body of important scholarship over
|
||||
the course of the last eight decades.
|
||||
|
||||
(^1) Throughout this paper, we generally refer to Dirección de Publicaciones de El Colegio de México, A.C. simply as El
|
||||
Colegio de México or by its common name “Colmex.”
|
||||
|
||||
|
||||
The press was established in 1938 in Mexico City. It attracted a group of pathbreaking
|
||||
scholars in the humanities and social sciences, and Colmex’s press—one of the earliest
|
||||
scholarly publishers in Latin America—provided an outlet for their work, which
|
||||
foregrounded some of the ongoing lines of inquiry in Mexican and Latin American
|
||||
studies, including scholarship on migration to and from Mexico, the interplay between
|
||||
church and state in Latin America, and women’s rights.
|
||||
|
||||
The university’s press published its first title in 1938 and continued to publish significant
|
||||
work throughout its history. The list of the press spans disciplines in the humanities and
|
||||
qualitative social sciences, with special emphases on history, sociology, literary criticism,
|
||||
and political science. For the most part, the books focus on Mexican and Latin American
|
||||
contexts.
|
||||
|
||||
In addition to a robust books program, the press of El Colegio de Mexico publishes seven
|
||||
journals, including _Historia Mexicana,_ arguably the leading journal of Mexican
|
||||
historical studies. Over time, the press has also been an important outlet for making
|
||||
foreign-language writing available in Mexico: as one example, its journal _Diálogos_ was
|
||||
the first to publish Milan Kundera's work in Spanish for a Mexican audience.
|
||||
|
||||
Since 2013, Colmex’s press has published some of its new books in digital form and
|
||||
distributed them through digital scholarly platforms, including JSTOR. Like many
|
||||
established scholarly presses, Colmex licenses access to its frontlist titles to university
|
||||
libraries to help sustain its ongoing publishing program. However, much of Colmex’s
|
||||
backlist was out of print and the press had never digitized it due to limited funding. In
|
||||
today’s increasingly digital landscape, the lack of electronic copies of this important body
|
||||
of scholarship created, in essence, a barrier to accessing those titles.
|
||||
|
||||
This project sought to overcome this barrier and make these books discoverable and
|
||||
accessible for free by a worldwide audience. As noted in the Summary, El Colegio de
|
||||
México and JSTOR have collaborated over the past several years to make Colmex's
|
||||
frontlist books available to readers around the world through JSTOR. In this project, we
|
||||
built on that collaboration, bringing together Colmex's rich scholarly backlist with
|
||||
JSTOR's experience managing retrospective digitization projects and helping to increase
|
||||
the usage of academic content by making that content easy to find and use online.
|
||||
JSTOR has seen high usage and impact for both archival journals and for backlist
|
||||
monographs; in fact, two thirds of ebook usage on JSTOR is for titles published at least
|
||||
three years earlier.
|
||||
|
||||
|
||||
**Our Approach: Selection and Digitization**
|
||||
|
||||
JSTOR digitized nearly 700 titles, or almost 50% of the press’s backlist. Significantly,
|
||||
none of Colmex’s backlist titles were previously available digitally. For every book made
|
||||
available through this project, each page was scanned and OCR processed, and
|
||||
accompanying book and chapter-level metadata was captured to make the books fully
|
||||
searchable, discoverable, and usable for scholars and teachers.
|
||||
|
||||
Selection
|
||||
|
||||
We asked a group of scholar-advisors to help us assess the broader significance of
|
||||
Colmex's list in Mexican and Latin American Studies by drawing our attention to books
|
||||
that are noteworthy and that should be highlighted in outreach about the project to
|
||||
scholars, librarians, students, and general readers.
|
||||
|
||||
Our scholar-advisors assisted with the selection process mainly in two ways. First, they
|
||||
gave us high-level guidance to inform our strategic sense of the collection’s value. One
|
||||
advisor wrote to us that the press's list “[provides] studies of the economic, social,
|
||||
demographic, and political history of Mexico unparalleled by any other publisher.”
|
||||
|
||||
Several of the scholars also noted the broad discipline coverage of Colmex's list; while we
|
||||
expected that the bulk of the books would be of greatest interest to historians, another
|
||||
advisor wrote to us that “[s]ociologists, economists, demographers, linguists and
|
||||
students of literature, geographers, and historians will all benefit by achieving the digital
|
||||
availability of these works.” It is worth noting, as some of our advisors did, that the Press
|
||||
also has a strong list in Asian studies, and the set of titles that we digitized through this
|
||||
project includes books from that area. While the inclusion of these titles may initially
|
||||
seem like an odd fit for a project that focuses for the most part on Mexican and Latin
|
||||
American studies titles, the press's list in Asian studies reflects a critical aspect of the
|
||||
Mexican academy's global engagement. Colmex's Center of Asian and African Studies is,
|
||||
as one adviser noted, “the only functioning center on Asian studies in Latin America,”
|
||||
and Colmex's press, picking up on this strength, has become “the major publisher of
|
||||
studies of Asian history in Spanish.” To the extent that this digitization project is meant
|
||||
in part to reflect the strengths and disciplinary breadth of Colmex's backlist, it seemed
|
||||
important to include these titles in the project.
|
||||
|
||||
Second, while acknowledging the overall value of Colmex's backlist, our advisors also
|
||||
directed us to particular titles that have become classics in their field. For example, some
|
||||
of these titles include Silvio Zavala's multi-volume _El servicio personal de los indios en
|
||||
la Nueva España,_ a study of labor and slavery in the 16th to 18th centuries; books and
|
||||
edited volumes by Andrés Lira on Spanish exiles in Latin America after the Spanish Civil
War; and _Los bienes de la Iglesia en México,_ a study of the conflict between church and
|
||||
state in the 1800s.
|
||||
|
||||
Of particular note among the books we digitized is the _Historia general de Mexico,_ a
|
||||
multi-volume work completed in the 1970s and edited by the Colmex historian Daniel
|
||||
Cosío Villegas. This work covers the range of Mexico's history from the dawn of human
|
||||
habitation. As one reviewer in a scholarly journal noted, Cosío Villegas had a
|
||||
longstanding interest in reaching non-academic audiences, and so the scholars who
|
||||
penned essays for the _Historia general_ were asked to write such that a general audience
|
||||
could read the work. Thus, one project advisor wrote, the volumes are well suited to
|
||||
“students at the high school and university level as well as to adult readers who give
|
||||
them the time and attention they deserve.” Despite the essays being shaped for a non-
|
||||
academic audience, one of our advisors noted that the _Historia general_ remains “the
|
||||
standard general history [of Mexico] used by all scholars.”
|
||||
|
||||
With this guidance in mind, the list of books we digitized resulted from a winnowing
|
||||
process, the stages of which are outlined below^1:
|
||||
|
||||
1. At the start of this project, Colmex had the necessary permissions to digitize and
   make freely available in digital form a significant number of titles in their backlist,
   in many cases because the author was a faculty member at Colmex. Given the
   sizable expense involved in clearing digital rights, we determined that there was
   significant value in focusing our efforts on books that did not require painstaking
   rights research. Of the 1,411 titles in the backlist, Colmex's press has distribution
   rights for 741.
2. This list was then refined to exclude a small number of books that were not
   scholarly in nature (e.g., technical guides from the 1990s). We retained in the list,
   however, a small number of literary or primary source titles that would be useful
   for research and teaching.
3. The list was further refined to exclude titles that did not fit well with the
   humanities and humanistic social sciences profile^2. For example, books that
   focused on environmental policy were considered out of scope for this project.
4. Finally, based on cost estimates, we initially aimed to reach a final list of
   approximately 600 titles. Given cost constraints, we made difficult choices,
   including moving approximately 40 social science-leaning titles (many in political
   science) to a B-list. It is important to note that, while these titles were not
   included in the starting list for digitization, lower-than-anticipated costs allowed
   us to include these titles in our final output. We acknowledge that this initial
   selection process was not perfect, but we are pleased with the final outcome since
   these books hold value for humanities researchers (especially historians).

(^1) It is important to note that the winnowing process was undertaken by the project team with guidance from a set of
scholar-advisors for the project, given that it was not feasible to ask these advisors, who are also full-time faculty
members, to engage in a title-by-title selection process for a list of this size.

(^2) This project was funded through the Mellon Foundation’s Humanities Open Book Program, which emphasizes
out-of-print humanities books.
|
||||
At the end of our initial selection process, we had an A-list of 611 titles. While the vast
|
||||
majority of the books on this list were in history, literature, or other humanities fields,
|
||||
there were also a number of titles that were exceptions. Some titles on the list leaned
|
||||
more toward the social sciences, including a number of books on public policy. We felt
|
||||
that it would be appropriate to include them because they would be of interest to
|
||||
scholars of Mexican and Latin American history. In addition, a handful of titles on the
|
||||
list (fewer than ten) are literary or primary texts (for example, a Spanish-language
|
||||
translation of Giambattista Vico's _Scienza nuova_).
|
||||
|
||||
Production
|
||||
|
||||
JSTOR's production unit converts over 9 million pages of scholarly journal and book
|
||||
content per year, of which 2 million includes scanning from print sources. We have
|
||||
longstanding relationships with several digitization vendors, and we believed that our
|
||||
experience managing large-scale digitization projects would position us well to
|
||||
accomplish the digitization of Colmex's backlist books quickly, cost-efficiently, and to a
|
||||
high quality.
|
||||
|
||||
For books, JSTOR normally receives and processes PDFs from publishers. These PDFs
|
||||
go through automated workflows at JSTOR’s end as well as processing by a third-party
|
||||
vendor. This project was different because the source document for each book was a
|
||||
print version^1, and one of the required outputs was an ePub for each book. JSTOR
|
||||
selected one of our current conversion vendors, Apex CoVantage, to handle all vendor
|
||||
processing for the books in the project. This included scanning of the print copy, return
|
||||
shipment of the print copy to Colmex, creation of the PDF from the page images, OCR
|
||||
for creation of searchable full text, metadata capture to JSTOR standard specification for
|
||||
books, and then creation of an ePub. JSTOR negotiated a per-page price of $0.83 that
|
||||
covered all these tasks. The project covers 684 books.
|
||||
|
||||
Colmex sent nine shipments of print books to Apex CoVantage’s production facility in
Hyderabad, India. The initial batch was shipped in mid-April 2018, and the final batch was
shipped in early May 2019. Each shipment contained an average of 76 print books. Apex
conducted non-destructive scanning with each page scanned as 600 dpi bitonal TIFF
and grayscale/color content scanned at 300 dpi for RGB TIFF images. We instituted a
discrepancy process wherein Apex reported damaged, missing, or other problematic
pages to JSTOR. JSTOR assessed these reports and, as needed, worked with Colmex,
Harvard University Library, and University of Michigan Library collections to locate and
scan replacement pages from other extant copies. The resulting page scans were then
used in place of the damaged or otherwise unusable pages or to fill gaps where there
were missing pages so that the PDF would represent a complete and intact version of the
print original.

(^1) Although JSTOR has scanned a relatively small number of print books outside this project, the bulk of our print scanning
continues to be for journals. However, the same imaging specifications are used regardless of whether they are journal or
book pages.
|
||||
|
||||
Apex submitted the completed PDFs to JSTOR and shipped the print copies back to
|
||||
Colmex. JSTOR’s systems then ingested the PDF as well as spreadsheet-based supply
|
||||
chain metadata (SCM) provided separately by Colmex. The PDF and SCM were matched
|
||||
by the system and then were automatically sent to Apex for standard processing, which
|
||||
consists of OCR as well as book- and chapter-level metadata capture.
|
||||
|
||||
As Apex completed the standard processing for each book, they then put the books
|
||||
through an ePub creation process that, while very familiar to Apex, was new to JSTOR.
|
||||
The ePubs were created to the EPUB standard version 3.0.1 or higher. Additionally, the
|
||||
processing agreed upon between Apex and JSTOR ensured functionality such as links
|
||||
from footnote anchors in the text block to the footnotes themselves. However, features
|
||||
such as tables were captured as images rather than as HTML. During both the standard
|
||||
processing and ePub creation, Apex occasionally raised metadata capture queries that
|
||||
were reviewed and resolved by JSTOR’s metadata librarian team of Karen Aufdemberge,
|
||||
Emily Betwee, and Rachel Ross, thus ensuring a higher and more consistent quality for
|
||||
the metadata.
|
||||
|
||||
Apex grouped the ePub, the PDF, and the book- and chapter-level metadata XML files
|
||||
into a zip file for delivery to JSTOR. JSTOR systems then ingested the zip file and ran
|
||||
quality control scripts across the files to ensure that they adhered to our specifications.
|
||||
For the initial batch of books, we also conducted a limited amount of manual quality
|
||||
control reviews of the metadata and of the ePub. To accommodate the ingest of these zip
|
||||
files, however, our content management systems staff had to update the JSTOR software
|
||||
to recognize and accept the different directory structure and files that were present (i.e.,
|
||||
the directory containing the ePub as well as the ePub file itself) but had not been present
|
||||
in previous book deliverables from our vendors.
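
As a rough illustration of the kind of quality control check described above, the following Python sketch verifies that a delivery zip contains an ePub, a PDF, and at least one XML metadata file. The file names, the directory layout, and the `missing_deliverables` and `build_example_zip` helpers are assumptions made for this example; the paper does not describe JSTOR's actual scripts or specifications.

```python
import io
import zipfile

# Assumed deliverable types, based on the files the text says Apex grouped
# into each zip: the ePub, the PDF, and XML metadata.
REQUIRED_SUFFIXES = (".epub", ".pdf", ".xml")


def missing_deliverables(zip_bytes):
    """Return the required suffixes that no file in the archive matches."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = [name.lower() for name in archive.namelist()]
    return [
        suffix
        for suffix in REQUIRED_SUFFIXES
        if not any(name.endswith(suffix) for name in names)
    ]


def build_example_zip(filenames):
    """Build an in-memory zip with placeholder contents (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in filenames:
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()
```

Calling `missing_deliverables(build_example_zip(["book/book.pdf"]))` reports the ePub and XML files as missing, while an archive holding all three file types yields an empty list. A real check would also validate the contents of each file, which this sketch does not attempt.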
|
||||
|
||||
Furthermore, downstream systems for our content delivery platform had to be updated
|
||||
to recognize and appropriately route and make available the ePub file. JSTOR opted to
|
||||
treat the ePub in a manner similar to that of supplementary materials. The ePub is
|
||||
available as a downloadable file via a clickable “Download EPUB” button at the top of the
|
||||
page for each book in the project. Otherwise, the book is treated in a similar manner to
|
||||
any other Open Access title on JSTOR.
|
||||
|
||||
|
||||
Of the 684 books in the project, the first titles became available on the JSTOR site on
|
||||
September 11, 2018. The most recent releases were on July 19, 2019. There are currently
|
||||
four books for which processing cannot be completed because the books do not have
|
||||
ISBN assignments, and the JSTOR systems require an ISBN.
|
||||
|
||||
While this project had typical logistical challenges, the challenges that were new to
|
||||
JSTOR were:
|
||||
|
||||
1. The need to send the books to one particular vendor instead of dividing them
|
||||
equally between our two vendors, which was addressed earlier in these
|
||||
comments; and
|
||||
2. The lack of electronic version ISBN (EISBN) assignments for any of the books.
|
||||
|
||||
ISBN best practice indicates that an electronic version of a book should have an ISBN
|
||||
that is distinct from its print version counterpart. In fact, different electronic versions
|
||||
(e.g., PDF vs. EPUB) can have their own ISBN assignments. However, JSTOR opted to
|
||||
use a single ISBN assignment to cover both electronic versions of each book. Going into
|
||||
the project, Colmex did not have EISBN assignments for the books, and 102 of the books
|
||||
did not have a print version ISBN (PISBN) either. One problem was that, for Mexican-
|
||||
published works, ISBNs are assigned by a third-party agency, and the turnaround
|
||||
times for the assignments, particularly for large batches of requests, are unpredictable.
|
||||
Therefore, so as not to jeopardize the overall timeframe for the project, JSTOR opted to use
the PISBN for any book that had a PISBN assignment. This would allow us to ingest the
|
||||
supply chain metadata into JSTOR systems and to keep individual book processing
|
||||
moving beyond the print-scanning stage.
|
||||
|
||||
Meanwhile, Colmex would apply for EISBN assignments attempting to prioritize the
|
||||
assignments for those books that had no ISBN assignment at all. For books that had no
|
||||
ISBN assignment at all, we could move them back into production post-scanning once
|
||||
we had the EISBN assignment. For books that had a PISBN assignment, we plan to do a
|
||||
mass swap of the PISBN for the EISBN once we have all those assignments. At the time
|
||||
this paper was written, 128 books were still awaiting an EISBN assignment, including
|
||||
four books that have no ISBN assignment at all and that therefore cannot proceed
|
||||
beyond the scanning stage.
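
The identifier fallback described above can be sketched in Python as follows. The `Book` record, the `processing_isbn` and `isbn_status` functions, and the status labels are hypothetical names chosen for illustration, and any identifier strings passed in are placeholders rather than real ISBNs; this is a sketch of the stated policy, not JSTOR's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Book:
    """Hypothetical record holding the identifiers discussed in the text."""
    title: str
    pisbn: Optional[str] = None  # print-version ISBN, if one exists
    eisbn: Optional[str] = None  # electronic-version ISBN, once assigned


def processing_isbn(book):
    """Return the ISBN to process under: EISBN if assigned, else PISBN, else None."""
    return book.eisbn or book.pisbn


def isbn_status(book):
    """Classify a book by which identifier, if any, it can proceed with."""
    if book.eisbn:
        return "eisbn-assigned"
    if book.pisbn:
        return "pisbn-placeholder"  # proceeds now; PISBN is swapped for the EISBN later
    return "blocked"  # cannot move past scanning without any ISBN
```

Under this sketch, a book with only a PISBN proceeds as `"pisbn-placeholder"`, and a book with neither identifier is `"blocked"`, which corresponds to the four books that cannot move beyond the scanning stage.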
|
||||
|
||||
We are currently planning a project to swap the PISBN for the EISBN for those books
|
||||
where we have the EISBN assignments. We will finish the processing and/or ISBN
|
||||
swaps for the remaining books when the EISBN assignments are available. If we were to
|
||||
do a similar digitization project for backlist books, we would certainly investigate the
|
||||
EISBN situation at the earliest possible stage and work with project partners to secure
|
||||
EISBN assignments as soon as possible.
|
||||
|
||||
|
||||
**Usage: What We’ve Learned So Far**
|
||||
|
||||
This project represented not only an opportunity to digitize and make available books
|
||||
from the publication run of Colmex's list, but also to measure the usage of these books
|
||||
over time and, ultimately, to understand better the impact that foreign-language
|
||||
materials can have when hosted on a globally accessed platform.
|
||||
|
||||
Our objective in measuring usage was to understand how frequently the Colmex books
|
||||
are read online as evidenced by generally accepted metrics such as views and downloads
|
||||
of the chapter files. Additionally, we wanted to understand how this usage compares with
|
||||
the usage of approximately 4,500 openly accessible English-language books hosted on
|
||||
JSTOR.
|
||||
|
||||
JSTOR facilitates the discovery of ebook content in a variety of ways. We offer free
|
||||
MARC records to libraries through OCLC, and distribute metadata and full text to
|
||||
discovery services and search engines for indexing. Another important factor in driving
|
||||
usage is co-locating ebook chapters with journal articles on JSTOR’s integrated platform,
|
||||
enabling users to cross-search all types of content at once. For many scholars, JSTOR is a
|
||||
starting point for research—in fact, our traffic referral data shows that more than 40% of
|
||||
visits to ebook pages are by users who were already searching and using JSTOR. Faculty
|
||||
and students are incorporating ebooks into their established research workflows on the
|
||||
platform. In addition, we promoted the availability of the Colmex titles via a short
|
||||
animated video in English and Spanish, email campaigns to librarians and faculty in
|
||||
Latin American studies, announcements shared via JSTOR and Colmex’s web and social
|
||||
media channels, and promotions to members of the Latin American Studies Association,
|
||||
including advertisements and a presentation at the association’s annual conference.
|
||||
|
||||
The Colmex titles digitized through this project have been heavily used on JSTOR. The
|
||||
680 titles made available on JSTOR between September 2018 and July 2019 have been
|
||||
used a total of 502,134 times through October 28, 2019. Every single title has been used.
|
||||
The most-used titles are listed below.
|
||||
|
||||
**Top ten most-used titles**
|
||||
|
||||
| Title | Copyright year | Usage through 10/28/ |
| --- | --- | --- |
| Historia económica general de México: de la colonia a nuestros días | 2010 | 13,251 |
| Historia general de México: volumen I | 1994 | 9,323 |
| Los intelectuales y el poder en México | 1991 | 5,156 |
| De amicitia et doctrina: homenaje a Martha Elena Venier | 2007 | 4,785 |
| La lingüística en México, 1980-1996 | 1998 | 4,564 |
| Diccionario del español usual en México | 1996 | 4,503 |
| Introducción a la historia de la vida cotidiana | 2006 | 4,300 |
| Historia de la lectura en México | 1997 | 3,888 |
| Cuestiones de teoría sociológica | 2005 | 3,659 |
| Historia general de México: volumen II | 1994 | 3,595 |
|
||||
The data show that there is a broad audience for this scholarship. The titles have been
|
||||
used in 173 countries and territories. While high levels of usage were recorded in
|
||||
Spanish-speaking countries, as we expected, usage also occurred in 161 countries and
|
||||
territories where Spanish is not a national or official language. The map below shows the
|
||||
countries in which we have recorded usage for the Colmex titles, and the table lists the
|
||||
ten countries with the highest usage.
|
||||
|
||||
|
||||
## Top ten countries that recorded the most usage
|
||||
|
||||
| Country | Usage through 10/28/ |
| --- | --- |
| Mexico | 151, |
| United States | 54, |
| Colombia | 29, |
| Spain | 17, |
| Argentina | 13, |
| Peru | 11, |
| Chile | 9, |
| Ecuador | 9,143 |
| Costa Rica | 4,770 |
| United Kingdom | 3,580 |
|
||||
Because JSTOR works with thousands of institutions around the world, we can measure
|
||||
the usage of these titles at institutions that participate in our services. We recorded usage
|
||||
of the Colmex titles at 4,285 institutions. This included not only colleges and universities,
|
||||
but also community colleges, secondary schools, government and not-for-profit
|
||||
organizations, and public libraries.
|
||||
|
||||
JSTOR’s ebook program had not previously hosted EPUB files; for this project, we added
|
||||
the capability for users to download the full book as an EPUB file from the table of
|
||||
contents page, as well as the standard option to view or download chapter-level PDFs.
|
||||
There were 19,234 downloads of EPUB files for the Colmex titles through the end of
|
||||
October 2019—just 3.8% of the total usage of the titles in that timeframe.
|
||||
|
||||
This project also gave us the opportunity to compare the usage of Spanish and English-
|
||||
language titles available on JSTOR. On average, the Colmex ebooks are used 57% as
|
||||
much as the Open Access titles in English on the platform. While there are other
|
||||
variables that may affect the level of usage (such as discipline or copyright year), this
|
||||
figure shows an impressive amount of usage of Spanish-language titles on a primarily
|
||||
English-language scholarly content site.
|
||||
|
||||
We’ve also received positive feedback from librarians and scholars regarding the access
|
||||
to this content. For example, responses to the news on Twitter included praise for the
|
||||
initiative (“Excelente noticia para @elcolmex y el ámbito académico de México y el
|
||||
mundo”) and recommendations of specific titles (“Una de las joyas liberadas en acceso
|
||||
abierto [PDF / EPUB] por el Colmex a través de Jstor es /Los intelectuales y el poder en
|
||||
México/ (1991) un nutrido volumen colectivo que contiene muy buenas intervenciones,
|
||||
algunas de ellas referencias obligadas.”)
|
||||
|
||||
**Conclusion**
|
||||
|
||||
As a result of this project, 680 significant works of scholarship (with four more coming
|
||||
when EISBN assignments are available) that were previously out of print are now
|
||||
|
||||
|
||||
available to anyone who wishes to use them. They are easy to discover and access within
|
||||
researchers’ existing digital workflows. The value of these titles is apparent in the strong
|
||||
usage we’ve seen over the relatively short period they’ve been available: more than half a
|
||||
million views and downloads across 173 countries. Scholars and students in Latin
|
||||
America and around the world are enriching their research with this content, and we
|
||||
have ensured that it will be available to future generations.
|
||||
|
||||
In addition, the Mexican government launched a project earlier this year, the Estrategia
|
||||
Nacional de Lectura, to promote reading and guarantee that books are accessible to the
|
||||
entire population. The 684 digitized titles will be openly available to the Mexican people
|
||||
and promoted as part of this project.
|
||||
|
||||
This project also built a foundation for continued work on the Open Access
|
||||
dissemination of Latin American scholarship. JSTOR is currently participating in a pilot
|
||||
led by the Latin American Research Resources Project (LARRP), a consortium of
|
||||
research libraries that is funding the Open Access distribution of 200 titles published in
|
||||
2018-2019 by the Latin American Council of Social Sciences (CLACSO). This initiative,
|
||||
developed and supported by libraries, will test a framework for the sustainable, long-
|
||||
term stewardship of Open Access scholarly monographs.
|
||||
|
||||
We are grateful that the Humanities Open Book program grant funded by The Andrew
|
||||
W. Mellon Foundation provided the opportunity for JSTOR to partner with El Colegio de
|
||||
México to make its important scholarship available for researchers around the world to
|
||||
discover and use. We look forward to continuing to build on what we’ve achieved
|
||||
together.
|
||||
|
||||
|
400
examples/The-Man-Without-A-Body.md
Normal file
@ -0,0 +1,400 @@
|
||||
```
|
||||
{from} THE {New York} SUN, SUNDAY, MARCH 25, 1877.
|
||||
```
|
||||
## THE MAN WITHOUT A BODY
|
||||
|
||||
```
|
||||
{by Edward Page Mitchell}
|
||||
```
|
||||
On a shelf in the old Arsenal museum, in the
|
||||
Central Park, in the midst of stuffed
|
||||
hummingbirds, ermines, silver foxes, and
|
||||
bright-colored parakeets, there is a ghastly row
|
||||
of human heads. I pass by the mummied
|
||||
Peruvian, the Maori chief, and the Flathead
|
||||
Indian to speak of a Caucasian head which has
|
||||
had a fascinating interest to me ever since it was
|
||||
added to the grim collection a little more than a
|
||||
year ago.
|
||||
I was struck with the Head when I first saw it.
|
||||
The pensive intelligence of the features won
|
||||
me. The face is remarkable, although the nose
|
||||
is gone, and the nasal fossæ are somewhat the
|
||||
worse for wear. The eyes are likewise wanting,
|
||||
but the empty orbs have an expression of their
|
||||
own. The parchmenty skin is so shriveled that
|
||||
the teeth show to their roots in the jaws. The
|
||||
mouth has been much affected by the ravages
|
||||
of decay, but what mouth there is displays
|
||||
character. It seems to say: "Barring certain
|
||||
deficiencies in my anatomy, you behold a man
|
||||
of parts!" The features of the Head are of the
|
||||
Teutonic cast, and the skull is the skull of a
|
||||
philosopher. What particularly attracted my
|
||||
attention, however, was the vague resemblance
|
||||
which this dilapidated countenance bore to
|
||||
some face which had at some time been familiar
|
||||
to me — some face which lingered in my
|
||||
memory, but which I could not place.
|
||||
After all, I was not greatly surprised, when I
|
||||
had known the Head for nearly a year, to see it
|
||||
acknowledge our acquaintance and express its
|
||||
appreciation of friendly interest on my part by
|
||||
deliberately winking at me as I stood before its
|
||||
glass case.
|
||||
This was on a Trustees' day, and I was the
|
||||
only visitor in the hall. The faithful attendant
|
||||
had gone to enjoy a can of beer with his friend,
|
||||
the superintendent of the monkeys.
|
||||
The Head winked a second time, and even
|
||||
more cordially than before. I gazed upon its
|
||||
efforts with the critical delight of an anatomist.
|
||||
I saw the masseter muscle flex beneath the
|
||||
leathery skin. I saw the play of the buccinators,
|
||||
and the beautiful lateral movement of the
|
||||
internal pterygoid. I knew the Head was trying
|
||||
to speak to me. I noted the convulsive
|
||||
twitchings of the risorius and the zygomatic
|
||||
|
||||
```
|
||||
major, and knew that it was endeavoring to
|
||||
smile.
|
||||
"Here," I thought, "is either a case of vitality
|
||||
long after decapitation, or, an instance of reflex
|
||||
action where there is no diastaltic or excitor-
|
||||
motory system. In either case the phenomenon
|
||||
is unprecedented, and should be carefully
|
||||
observed. Besides, the Head is evidently well
|
||||
disposed toward me." I found a key on my
|
||||
bunch which opened the glass door.
|
||||
"Thanks," said the Head. "A breath of fresh
|
||||
air is quite a treat."
|
||||
"How do you feel?" I asked politely. "How
|
||||
does it seem without a body?"
|
||||
The Head shook itself sadly and sighed. "I
|
||||
would give," it said, speaking through its
|
||||
ruined nose, and for obvious reasons using
|
||||
chest tones sparingly, "I would give both ears
|
||||
for a single leg. My ambition is principally
|
||||
ambulatory, and yet I cannot walk. I cannot
|
||||
even hop or waddle. I would fain travel, roam,
|
||||
promenade, circulate in the busy paths of men,
|
||||
but I am chained to this accursed shelf. I am no
|
||||
better off than these barbarian heads — I, a man
|
||||
of science! I am compelled to sit here on my
|
||||
neck and see sandpipers and storks all around
|
||||
me, with legs and to spare. Look at that infernal
|
||||
little Oedieneninus Longpipes over there. Look
|
||||
at that miserable Gray-headed Porphyrio. They
|
||||
have no brains, no ambition, no yearnings. Yet
|
||||
they have legs, legs, legs in profusion." He cast
|
||||
an envious glance across the alcove at the
|
||||
tantalizing limbs of the birds in question, and
|
||||
added gloomily, "There isn't even enough of
|
||||
me to make a hero for one of Wilkie Collins's
|
||||
novels."
|
||||
I did not exactly know how to console him in
|
||||
so delicate a manner, but ventured to hint that
|
||||
perhaps his condition had its compensations in
|
||||
immunity from corns and the gout.
|
||||
"And as to arms," he went on, "there's
|
||||
another misfortune for you! I am unable to
|
||||
brush away the flies that get in here — Lord
|
||||
knows how — in the summertime. I cannot
|
||||
reach over and cuff that confounded Chinook
|
||||
mummy that sits there grinning at me like a
|
||||
jack-in-the-box. I cannot scratch my head or
|
||||
even blow my nose [his nose!] decently when I
|
||||
get cold in this thundering draught. As to eating
|
||||
```
|
||||
|
||||
and drinking, I don't care. My soul is wrapped
|
||||
up in Science. Science is my bride, my divinity.
|
||||
I worship her footsteps in the past, and hail the
|
||||
prophecy of her future progress. I —"
|
||||
I had heard these sentiments before. In a flash
|
||||
I had accounted for the familiar look which had
|
||||
haunted me ever since I first saw the Head.
|
||||
"Pardon me," I said, "you are the celebrated
|
||||
Prof. Dummkopf?"
|
||||
"That is, or was, my name," he replied, with
|
||||
dignity.
|
||||
"And you formerly lived in Boston, where you
|
||||
carried on scientific experiments of startling
|
||||
originality. It was you who first discovered how
|
||||
to photograph smell, how to bottle music, how
|
||||
to freeze the aurora borealis. It was you who first
|
||||
applied spectrum analysis to Mind."
|
||||
"These were some of my minor
|
||||
achievements," said the Head, sadly nodding
|
||||
itself — "small when compared with my final
|
||||
invention, the grand discovery which was at the
|
||||
same time my greatest triumph and my ruin. I
|
||||
lost my Body in an experiment."
|
||||
"How was that?" I asked. "I had not heard."
|
||||
"No," said the Head. "Living alone and
|
||||
friendless, my disappearance was hardly
|
||||
noticed. I will tell you —"
|
||||
There was a sound upon the stairway.
|
||||
"Hush!" cried the Head. "Here comes
|
||||
somebody. We must not be discovered. You
|
||||
must dissemble."
|
||||
I hastily closed the door of the glass case,
|
||||
locked it just in time to evade the vigilance of
|
||||
the returning keeper, and dissembled by
|
||||
pretending to examine, with great interest, Anas
|
||||
Acuta, or Pin-tailed Duck.
|
||||
On the next Trustees' day I revisited the
|
||||
Museum and gave the keeper of the Head a
|
||||
dollar on the pretense of purchasing
|
||||
information in regard to the curiosities in his
|
||||
charge. He made the circuit of the hall with me,
|
||||
talking volubly all the while.
|
||||
"That there," he said, as we stood before the
|
||||
Head, "is a relict of morality presented to the
|
||||
Museum fifteen months ago. The head of a
|
||||
notorious murderer gilteened at Paris in the last
|
||||
century, sir."
|
||||
I fancied that I saw a slight twitching about
|
||||
the corners of Prof. Dummkopf’s mouth and an
|
||||
almost imperceptible depression of what was
|
||||
once his left eyelid, but he kept his face
|
||||
remarkably well under the circumstances. I
|
||||
|
||||
```
|
||||
dismissed my guide with many thanks for his
|
||||
intelligent services, and, as I had anticipated, he
|
||||
departed forthwith to invest his easily earned
|
||||
dollar in beer, leaving me to pursue my
|
||||
conversation with the Head.
|
||||
"Think of putting a wooden-headed idiot like
|
||||
that," said the Professor, after I had opened his
|
||||
glass prison, "in charge of a portion, however
|
||||
small, of a man of science — of the inventor of
|
||||
the Telepomp! Paris! Murderer! Last century,
|
||||
indeed!" and the Head shook with laughter
|
||||
until I feared that it would tumble off the shelf.
|
||||
"You spoke of your invention, the
|
||||
Telepomp," I suggested.
|
||||
"Ah, yes," said the Head, simultaneously
|
||||
recovering its gravity and its center of gravity;
|
||||
"I promised to tell you how I happen to be a
|
||||
Man without a Body. You see that some three
|
||||
or four years ago I discovered the principle of
|
||||
the transmission of sound by electricity. My
|
||||
Telephone, as I called it, would have been an
|
||||
invention of great practical utility if I had been
|
||||
spared to introduce it to the public. But, alas-"
|
||||
"Excuse the interruption," I said, "but I must
|
||||
inform you that somebody else has recently
|
||||
accomplished the same thing. The Telephone
|
||||
is a realized fact."
|
||||
"Have they gone any further?" he eagerly
|
||||
asked. "Have they discovered the great secret
|
||||
of the transmission of atoms? In other words,
|
||||
have they accomplished the Telepomp?"
|
||||
"I have heard nothing of the kind," I hastened
|
||||
to assure him, "but what do you mean?"
|
||||
"Listen," he said. "In the course of my
|
||||
experiments with the Telephone I became
|
||||
convinced that the same principle was capable
|
||||
of indefinite expansion. Matter is made up of
|
||||
molecules, and molecules, in their turn, are
|
||||
made up of atoms. The atom, you know, is the
|
||||
unit of being. The molecules differ according to
|
||||
the number and the arrangement of their
|
||||
constituent atoms. Chemical changes are
|
||||
effected by the dissolution of the atoms in the
|
||||
molecules and their rearrangements into
|
||||
molecules of another kind. This dissolution
|
||||
may be accomplished by chemical affinity or by
|
||||
a sufficiently strong electric current. Do you
|
||||
follow me?"
|
||||
"Perfectly."
|
||||
"Well, then, following out this line of thought,
|
||||
I conceived a great idea. There was no reason
|
||||
why matter could not be telegraphed, or, to be
|
||||
```
|
||||
|
||||
etymologically accurate, 'telepomped.' It was
|
||||
only necessary to effect at one end of the line the
|
||||
disintegration of the molecules into atoms, and
|
||||
to convey the vibrations of the chemical
|
||||
dissolution by electricity to the other pole,
|
||||
where a corresponding reconstruction could be
|
||||
effected from other atoms. As all atoms are
|
||||
alike, their arrangement into molecules of the
|
||||
same order, and the arrangement of those
|
||||
molecules into an organization similar to the
|
||||
original organization, would be practically a
|
||||
reproduction of the original. It would be a
|
||||
materialization — not in the sense of the
|
||||
Spiritualists' cant, but in all the truth and logic
|
||||
of stern science. Do you still follow me?"
|
||||
"It is a little misty," I said, "but I think I get
|
||||
the point. You would telegraph the Idea of the
|
||||
matter, to use the word Idea in Plato's sense."
|
||||
"Precisely. A candle flame is the same candle
|
||||
flame although the burning gas is continually
|
||||
changing. A wave on the surface of water is the
|
||||
same wave, although the water composing it is
|
||||
shifting as it moves. A man is the same man
|
||||
although there is not an atom in his body which
|
||||
was there five years before. It is the Form, the
|
||||
Shape, the Idea, that is essential. The vibrations
|
||||
that give individuality to matter may be
|
||||
transmitted to a distance by wire just as readily
|
||||
as the vibrations that give individuality to
|
||||
sound. So I constructed an instrument by which
|
||||
I could pull down matter, so to speak, at the
|
||||
anode and build it up again on the same plan at
|
||||
the cathode. This was my Telepomp."
|
||||
"But in practice — how did the Telepomp
|
||||
work?"
|
||||
"To perfection! In my rooms on Joy street, in
|
||||
Boston, I had about five miles of wire. I had no
|
||||
difficulty in sending simple compounds, such
|
||||
as quartz, starch, and water, from one room to
|
||||
another over this five-mile coil. I shall never
|
||||
forget the joy with which I disintegrated a three-
|
||||
cent postage stamp in one room and found it
|
||||
immediately reproduced at the receiving
|
||||
instrument in another. This success with
|
||||
inorganic matter emboldened me to attempt the
|
||||
same thing with a living organism. I caught a
|
||||
cat — a black and yellow cat — and I submitted
|
||||
him to a terrible current from my two-hundred-
|
||||
cup battery. The cat disappeared in a twinkling.
|
||||
I hastened to the next room and, to my immense
|
||||
satisfaction, found Thomas there, alive and
|
||||
|
||||
```
|
||||
purring, although somewhat astonished. It
|
||||
worked like a charm."
|
||||
"This is certainly very remarkable."
|
||||
"Isn't it? After my experiment with the cat, a
|
||||
gigantic idea took possession of me. If I could
|
||||
send a feline being, why not send a human
|
||||
being? If I could transmit a cat five miles by
|
||||
wire in a flash of electricity, why not transmit a
|
||||
man to London by Atlantic cable and with equal
|
||||
despatch? I resolved to strengthen my already
|
||||
powerful battery and try the experiment. Like a
|
||||
thorough votary of science, I resolved to try the
|
||||
experiment on myself.
|
||||
"I do not like to dwell upon this chapter of my
|
||||
experience," continued the Head, winking at a
|
||||
tear which had trickled down on to his cheek
|
||||
and which I silently wiped away for him with my
|
||||
own pocket handkerchief. "Suffice it that
|
||||
I trebled the cups in my battery, stretched my
|
||||
wire over housetops to my lodgings in Phillips
|
||||
street, made everything ready, and with a
|
||||
solemn calmness born of my confidence in the
|
||||
theory, placed myself in the receiving
|
||||
instrument of the Telepomp at my Joy street
|
||||
office. I was as sure that when I made the
|
||||
connection with the battery I would find myself
|
||||
in my rooms in Phillips street as I was sure of
|
||||
my existence. Then I touched the key that let on
|
||||
the electricity. Alas!"
|
||||
For some moments my friend was unable to
|
||||
speak. At last, with an effort, he resumed his
|
||||
narrative.
|
||||
"I began to disintegrate at my feet and slowly
|
||||
disappeared under my own eyes. My legs
|
||||
melted away, and then my trunk and arms. That
|
||||
something was wrong, I knew from the
|
||||
exceeding slowness of my dissolution, but I was
|
||||
helpless. Then my head went and I lost all
|
||||
consciousness. According to my theory, my
|
||||
head, having been the last to disappear, should
|
||||
have been the first to materialize at the other
|
||||
end of the wire. The theory was confirmed in
|
||||
fact. I recovered consciousness. I opened my
|
||||
eyes in my Phillips street apartments. My chin
|
||||
was materializing, and with great satisfaction I
|
||||
saw my neck slowly taking shape. Suddenly,
|
||||
and about at the third cervical vertebra, the
|
||||
process stopped. In a flash I knew the reason. I
|
||||
had forgotten to replenish the cups of my
|
||||
battery with fresh sulphuric acid, and there was
|
||||
not electricity enough to materialize the rest of
|
||||
```
|
||||
|
||||
me. I was a Head, but my body was, Lord
|
||||
knows where!"
|
||||
I did not attempt to offer consolation. Words
|
||||
would have been mockery in the presence of
|
||||
Prof. Dummkopf's grief.
|
||||
"What matters it about the rest?" he sadly
|
||||
continued. "The house in Phillips Street was
|
||||
full of medical students. I suppose that some of
|
||||
them found my Head, and knowing nothing of
|
||||
me or of the Telepomp, appropriated it for
|
||||
purposes of anatomical study. I suppose that
|
||||
they attempted to preserve it by means of some
|
||||
arsenical preparation. How badly the work was
|
||||
done is shown by my defective nose. I suppose
|
||||
that I drifted from medical student to medical
|
||||
student, and from anatomical cabinet to
|
||||
anatomical cabinet until some would-be
|
||||
humorist presented me to this collection as a
|
||||
French murderer of the last century. For some
|
||||
months I knew nothing, and when I recovered
|
||||
consciousness I found myself here.
|
||||
"Such," added the Head, with a dry, harsh
|
||||
laugh, "is the irony of Fate!"
|
||||
"Is there nothing I can do for you?" I asked,
|
||||
after a pause.
|
||||
"Thank you," the Head replied; "I am
|
||||
tolerably cheerful and resigned. I have lost
|
||||
pretty much all interest in experimental
|
||||
Science. I sit here day after day and watch the
|
||||
objects of zoological, ichthyological,
|
||||
ethnological, and conchological interest with
|
||||
which this admirable museum abounds. I don't
|
||||
know of anything you can do for me.
|
||||
"Stay," he added, as his gaze fell once more
|
||||
upon the exasperating legs of the Oedieneninus
|
||||
Longpipes opposite him. "If there is anything I
|
||||
do feel the need of, it is out-door exercise.
|
||||
Couldn't you manage in some way to take me
|
||||
out for a walk?"
|
||||
I confess that I was somewhat staggered by
|
||||
this request, but promised to do what I could.
|
||||
After some deliberation, I formed a plan, which
|
||||
was carried out in the following manner:
|
||||
I returned to the Museum that afternoon just
|
||||
before the closing hour, and hid myself behind
|
||||
the mammoth sea cow, or Manatus
|
||||
Americanus. The attendant, after a cursory
|
||||
glance through the hall, locked up the building
|
||||
and departed. Then I came boldly forth and
|
||||
removed my friend from his shelf. With a piece
|
||||
of stout twine, I lashed his one or two vertebrae
|
||||
to the headless vertebrae of a skeleton Moa.
|
||||
|
||||
```
|
||||
This gigantic and extinct bird of New Zealand
|
||||
is heavy legged, full breasted, tall as a man, and
|
||||
has huge, sprawling feet. My friend, thus
|
||||
provided with legs and arms, manifested
|
||||
extraordinary glee. He walked about, stamped
|
||||
his big feet, swung his wings, and occasionally
|
||||
broke forth into an hilarious shuffle. I was
|
||||
obliged to remind him that he must support the
|
||||
dignity of the venerable bird whose skeleton he
|
||||
had borrowed. I despoiled the African lion of his
|
||||
glass eyes, and inserted them in the empty
|
||||
orbits of the Head. I gave Prof. Dummkopf a
|
||||
Fiji war lance for a walking stick, covered him
|
||||
with a Sioux blanket, and then we issued forth
|
||||
from the old Arsenal into the fresh night air and
|
||||
the moonlight, and wandered arm in arm along
|
||||
the shores of the quiet lake and through the
|
||||
mazy paths of the Ramble.
|
||||
```
|
||||
## {THE END}
|
||||
|
||||
|
7317
examples/The-War-of-the-Worlds.md
Normal file
File diff suppressed because it is too large
Load Diff
2250
examples/Tragedy-Of-The-Commons.md
Normal file
File diff suppressed because it is too large
Load Diff
31237
examples/Watered-Soul-Blog-Book.md
Normal file
File diff suppressed because it is too large
Load Diff
10420
examples/WoodUp.md
Normal file
File diff suppressed because it is too large
Load Diff
1881
examples/compressed.tracemonkey-pldi-09.md
Normal file
File diff suppressed because it is too large
Load Diff
13903
examples/dict.md
Normal file
File diff suppressed because it is too large
Load Diff
@ -23,10 +23,7 @@ export default class DetectCodeQuoteBlocks extends ItemTransformer {
|
||||
const codeBlockItems = new Set<string>();
|
||||
let foundCodeItems = 0;
|
||||
|
||||
groupByPage(inputItems).forEach((pageItems, pageIdx) => {
|
||||
if (pageIdx === 5) {
|
||||
console.log(pageItems[0].data['str']);
|
||||
}
|
||||
groupByPage(inputItems).forEach((pageItems) => {
|
||||
const minX = toMinX(pageItems);
|
||||
groupByBlock(pageItems).forEach((blockItems) => {
|
||||
if (!blockItems[0].data['types'] && looksLikeCodeBlock(minX, blockItems, mostUsedHeight)) {
|
||||
|
@ -20,6 +20,7 @@ import DetectToc, { TOC_GLOBAL } from 'src/transformer/DetectToc';
|
||||
import DetectHeaders from 'src/transformer/DetectHeaders';
|
||||
import TOC from 'src/TOC';
|
||||
import { getText } from 'src/support/items';
|
||||
import MarkdownConverter from 'src/convert/MarkdownConverter';
|
||||
|
||||
pdfjs.GlobalWorkerOptions.workerSrc = `pdfjs-dist/es5/build/pdf.worker.min.js`;
|
||||
|
||||
@ -52,6 +53,13 @@ describe.each(files)('Test %p', (file) => {
|
||||
debug = await pipeline.parse(data, () => {}).then((pc) => pc.debug());
|
||||
});
|
||||
|
||||
test('Compare Markdown', async () => {
|
||||
const lastStage = debug.stageResult(debug.stageNames.length - 1);
|
||||
const items = lastStage.itemsCleanedAndUnpacked();
|
||||
const text = new MarkdownConverter().convert(items);
|
||||
expect(text).toMatchFile(markdownFilePath(file));
|
||||
});
|
||||
|
||||
test.each(transformers.map((t) => t.name).filter((name) => name !== 'Does nothing'))(
|
||||
'stage %p',
|
||||
(transformerName) => {
|
||||
@ -81,7 +89,9 @@ describe.each(files)('Test %p', (file) => {
|
||||
try {
|
||||
chunkedLines.forEach((lines, idx) => {
|
||||
const transformerResultAsString = lines.join('\n') || '{}';
|
||||
expect(transformerResultAsString).toMatchFile(matchFilePath(file, transformerName, chunkedLines.length, idx));
|
||||
expect(transformerResultAsString).toMatchFile(
|
||||
transformedFilePath(file, transformerName, chunkedLines.length, idx),
|
||||
);
|
||||
});
|
||||
} finally {
|
||||
stageResult.globals.keys().forEach((globalKey) => {
|
||||
@ -92,20 +102,31 @@ describe.each(files)('Test %p', (file) => {
|
||||
);
|
||||
});
|
||||
|
||||
function matchFilePath(pdfFileName: string, transformerName: string, chunkCount = 1, chunkIndex = 0): string {
|
||||
function transformedFilePath(pdfFileName: string, transformerName: string, chunkCount = 1, chunkIndex = 0): string {
|
||||
const pdfFileNameWithoutExtension = pdfFileName.substr(0, pdfFileName.length - 4);
|
||||
const resultFileName = `${transformerName[0].toLowerCase() + transformerName.slice(1).replace(/\s/g, '')}`;
|
||||
const fileIndex = chunkCount > 1 ? `.${chunkIndex}` : '';
|
||||
return `${folder}/${pdfFileNameWithoutExtension}/${resultFileName}${fileIndex}.json`;
|
||||
}
|
||||
|
||||
describe('Selective transforms on URL PDFs', () => {
|
||||
const transformerNames = [new RemoveRepetitiveItems().name, new DetectToc().name, new DetectHeaders().name];
|
||||
test.each(urls)('URL %p', async (url) => {
|
||||
const { fileName, data } = download(url);
|
||||
function markdownFilePath(pdfFileName: string): string {
|
||||
const pdfFileNameWithoutExtension = pdfFileName.substr(0, pdfFileName.length - 4);
|
||||
return `${folder}/${pdfFileNameWithoutExtension}.md`;
|
||||
}
|
||||
|
||||
const transformerNames = [new RemoveRepetitiveItems().name, new DetectToc().name, new DetectHeaders().name];
|
||||
describe.each(urls)('Test URL %p', (url) => {
|
||||
const { fileName, data } = download(url);
|
||||
|
||||
test(`markdown from ${url}`, async () => {
|
||||
const transformResult = await pipeline.parse(data, () => {}).then((pc) => pc.transform());
|
||||
const text = transformResult.convert(new MarkdownConverter());
|
||||
expect(text).toMatchFile(markdownFilePath(fileName));
|
||||
});
|
||||
|
||||
test(`stages from ${url}`, async () => {
|
||||
const debug = await pipeline.parse(data, () => {}).then((pc) => pc.debug());
|
||||
const printedGlobals = new Set<string>();
|
||||
|
||||
transformerNames.forEach((transformerName) => {
|
||||
const stageResult = debug.stageResult(debug.stageNames.indexOf(transformerName));
|
||||
const pages = stageResult.selectPages(true, true);
|
||||
@ -124,7 +145,7 @@ describe('Selective transforms on URL PDFs', () => {
|
||||
);
|
||||
|
||||
const transformerResultAsString = lines.join('\n') || '{}';
|
||||
expect(transformerResultAsString).toMatchFile(matchFilePath(fileName, transformerName));
|
||||
expect(transformerResultAsString).toMatchFile(transformedFilePath(fileName, transformerName));
|
||||
|
||||
stageResult.globals.keys().forEach((globalKey) => {
|
||||
printedGlobals.add(globalKey);
|
||||
|
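The `markdownFilePath` helper in the diff above strips the extension with `substr(0, pdfFileName.length - 4)`, which assumes every input name ends in a four-character suffix such as `.pdf`. A minimal sketch of an extension-agnostic variant follows. The name `expectedMarkdownPath` and the explicit `folder` parameter are assumptions made for illustration and are not part of the repository.

```typescript
// Hypothetical helper (not from the repository): builds the path of the
// expected-Markdown file for a given PDF name without assuming that the
// extension is exactly four characters long.
function expectedMarkdownPath(folder: string, pdfFileName: string): string {
  // Cut at the last dot only; names such as
  // "compressed.tracemonkey-pldi-09.pdf" keep their inner dots.
  const dot = pdfFileName.lastIndexOf('.');
  const baseName = dot > 0 ? pdfFileName.slice(0, dot) : pdfFileName;
  return `${folder}/${baseName}.md`;
}

console.log(expectedMarkdownPath('examples', 'Alice-In-Wonderland.pdf'));
console.log(expectedMarkdownPath('examples', 'compressed.tracemonkey-pldi-09.pdf'));
```

Using `slice` with `lastIndexOf` also avoids `String.prototype.substr`, which is a legacy method. Whether that change is worth making in the test file is a judgment call, since every fixture in this commit is a `.pdf`.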