I am developing a PHP application that builds a database of BibTeX entries which I collect from a website. The problem is that when I download the contents of the web page, I get this unwanted content (including whitespace at the beginning of the file):
BibTeX bibliography toplas.bib
%%% -*-BibTeX-*-
%%% ====================================================================
%%% BibTeX-file{
%%% author-1 = "Preston Briggs",
%%% author-2 = "Nelson H. F. Beebe",
%%% version = "2.115",
%%% date = "21 January 2015",
%%% time = "07:15:05 MDT",
%%% filename = "toplas.bib",
%%% address-1 = "Tera Computer Company
%%% 2815 Eastlake East
%%% Seattle, WA 98102
%%% USA",
%%% address-2 = "University of Utah
%%% Department of Mathematics, 110 LCB
%%% 155 S 1400 E RM 233
%%% Salt Lake City, UT 84112-0090
%%% USA",
%%% telephone-1 = "+1 206 325-0800",
%%% telephone-2 = "+1 801 581 5254",
%%% FAX-2 = "+1 801 581 4148",
%%%
%%% codetable = "ISO/ASCII",
%%% keywords = "bibliography, BibTeX, ACM Transactions on
%%% Programming Languages and Systems, TOPLAS",
%%% license = "public domain",
%%% supported = "yes",
%%% docstring = "This is a COMPLETE bibliography of the journal
%%% ACM Transactions on Programming Languages and
%%% Systems (CODEN ATPSDT, ISSN 0164-0925
%%% (print), 1558-4593 (electronic)), informally
%%% known as TOPLAS, covering volumes 1--21
%%% (1988--1999).
%%%
%%% The publisher maintains a World Wide Web site
%%% for this journal at
%%%
%%%
%%% At version 2.115, the year coverage looked
%%% like this:
%%%
%%% 1979 ( 20) 1992 ( 22) 2005 ( 33)
%%% 1980 ( 33) 1993 ( 32) 2006 ( 27)
%%% 1981 ( 28) 1994 ( 66) 2007 ( 45)
%%% 1982 ( 39) 1995 ( 39) 2008 ( 33)
%%% 1983 ( 36) 1996 ( 29) 2009 ( 23)
%%% 1984 ( 34) 1997 ( 35) 2010 ( 21)
%%% 1985 ( 34) 1998 ( 34) 2011 ( 21)
%%% 1986 ( 26) 1999 ( 32) 2012 ( 17)
%%% 1987 ( 27) 2000 ( 28) 2013 ( 14)
%%% 1988 ( 32) 2001 ( 14) 2014 ( 14)
%%% 1989 ( 30) 2002 ( 21) 2015 ( 5)
%%% 1990 ( 28) 2003 ( 20)
%%% 1991 ( 30) 2004 ( 28)
%%%
%%% Article: 1049
%%% TechReport: 1
%%%
%%% Total entries: 1050
%%%
%%% This bibliography was initially constructed
%%% by hand by the first author (PB) from
%%% various sources, and at its last release in
%%% February 1995, had 447 entries.
%%%
%%% It was further extended by the second
%%% author (NHFB) using bibliographies in
%%% NHFB's personal files, from the OCLC
%%% Contents1st database, from the IEEE INSPEC
%%% database, from the computer graphics
%%% bibliography archive at ftp.siggraph.org,
%%% and from the computer science bibliography
%%% collection on ftp.ira.uka.de in
%%% /pub/bibliography to which many people of
%%% have contributed. The snapshot of this
%%% collection was taken on 5-May-1994, and it
%%% consists of 441 BibTeX files, 2,672,675
%%% lines, 205,289 entries, and 6,375
%%% <at>String{} abbreviations, occupying
%%% 94.8MB of disk space. This work updated 85
%%% existing entries and added 104 new entries,
%%% completing coverage to for all issues up to
%%% Volume 17, Number 5, September 1995.
%%%
%%% Numerous errors in the sources noted above
%%% have been corrected. Spelling has been
%%% verified with the UNIX spell and GNU ispell
%%% programs using the exception dictionary
%%% stored in the companion file with extension
%%% .sok.
%%%
%%% The ACM maintains Web pages with journal
%%% tables of contents for 1985--1995 at
%%% That data has
%%% been automatically converted to BibTeX
%%% form, corrected for spelling and page
%%% number errors, and merged into this file.
%%%
%%% ACM copyrights explicitly permit abstracting
%%% with credit, so article abstracts, keywords,
%%% and subject classifications have been
%%% included in this bibliography wherever
%%% available. Article reviews have been
%%% omitted, until their copyright status has
%%% been clarified.
%%%
%%% bibsource keys in the bibliography entries
%%% below indicate the entry originally came
%%% from the computer science bibliography
%%% archive, even though it has likely since
%%% been corrected and updated.
%%%
%%% URL keys in the bibliography point to
%%% World Wide Web locations of additional
%%% information about the entry.
%%%
%%% BibTeX citation tags are uniformly chosen as
%%% name:year:abbrev, where name is the family
%%% name of the first author or editor, year is a
%%% 4-digit number, and abbrev is a 3-letter
%%% condensation of important title
%%% words. Citation tags were automatically
%%% generated by software developed for the
%%% BibNet Project.
%%%
%%% In this bibliography, entries are sorted in
%%% publication order, using bibsort -byvolume.
%%%
%%% The checksum field above contains a CRC-16
%%% checksum as the first value, followed by the
%%% equivalent of the standard UNIX wc (word
%%% count) utility output of lines, words, and
%%% characters. This is produced by Robert
%%% Solovay's checksum utility.",
%%% }
%%% ====================================================================
@Preamble{
"\hyphenation{
Fa-la-schi
Her-men-e-gil-do
Lu-ba-chev-sky
Pu-ru-sho-tha-man
Roe-ver
Ros-en-krantz
Ru-dolph
}" #
"\ifx \undefined \circled \def \circled #1{(#1)}\fi" #
"\ifx \undefined \reg \def \reg {\circled{R}}\fi"
}
%%% ====================================================================
%%% Acknowledgement abbreviations:
@String{ack-meo = "Melissa E. O'Neill,
School of Computing Science,
Simon Fraser University,
Burnaby, BC,
Canada V5A 1S6,@String{ack-nhfb = "Nelson H. F. Beebe,
University of Utah,
Department of Mathematics, 110 LCB,
155 S 1400 E RM 233,
Salt Lake City, UT 84112-0090, USA,
Tel: +1 801 581 5254,
FAX: +1 801 581 4148,@String{ack-pb = "Preston Briggs,
Tera Computer Company,
2815 Eastlake East,
Seattle, WA 98102,
USA,
Tel: +1 206 325-0800,
e-mail: \path|[email protected]|"}
%%% ====================================================================
%%% Journal abbreviations:
@String{j-TOPLAS = "ACM Transactions on Programming
Languages and Systems"}
%%% ====================================================================
%%% Bibliography entries:
So basically I need everything below the last line, "%%% Bibliography entries:", so that I can feed that content to my BibTeX parser.
I would really appreciate it if someone could show me an efficient way to do this.
Thanks
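
For illustration, here is a minimal sketch of one way this could be done, assuming the downloaded page is already in a PHP string and that the marker line appears exactly as "%%% Bibliography entries:". The function name extractBibEntries and the tiny sample entry are hypothetical and only there to make the example self-contained; it uses only the standard strrpos, strpos, substr and ltrim functions and needs PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper: returns everything after the marker line,
// or null when the marker is not found in the content.
function extractBibEntries(string $content, string $marker = '%%% Bibliography entries:'): ?string
{
    // strrpos() finds the LAST occurrence, in case the same text shows up earlier.
    $pos = strrpos($content, $marker);
    if ($pos === false) {
        return null;
    }

    // Skip the remainder of the marker line itself.
    $lineEnd = strpos($content, "\n", $pos);
    if ($lineEnd === false) {
        return ''; // the marker was the last line, nothing follows it
    }

    // Drop leading blank lines and whitespace before the first entry.
    return ltrim(substr($content, $lineEnd + 1));
}

// Made-up sample standing in for the downloaded page (assumption: in the real
// application this string would come from file_get_contents() or cURL).
$sample = <<<'BIB'
   BibTeX bibliography toplas.bib
%%% Journal abbreviations:
@String{j-TOPLAS = "ACM Transactions on Programming Languages and Systems"}
%%% Bibliography entries:
@Article{Example:1979:XYZ,
  author = "A. N. Author",
}
BIB;

$entries = extractBibEntries($sample);
echo $entries;
```

Note that this drops the @String definitions above the marker (such as j-TOPLAS), so if the entries refer to those abbreviations, the parser may need them kept or supplied separately.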