In recent years, the expansion of the Internet has led to a
deluge of information on the Web, making it difficult for users
to efficiently locate the information they need. To facilitate efficient
searching for information, research into technology that can
summarize the general outline of a text document is essential.
This is especially true on the Web, where information from
bulletin boards, blogs, and other sources is being used as
consumer-generated media data. Hence, summarization
technology that can accurately capture opinions, impressions,
and fields of discussion is necessary. However, research efforts
thus far have yet to yield satisfactory results. In this paper, we
propose a method for generating a summary document using
three types of attribute information acquired from the original
document: the field, associated terms, and sensibility. By using attribute
grammars that combine these three attributes in document
generation, we establish a formal and efficient generation
technology. Experiments using information from 400 blogs
found that, when the field and sensibility attributes are included,
the summary accuracy rate, readability, and meaning integrity
are 88.7%, 85%, and 86%, respectively. Compared with
traditional technologies, each of these three evaluation criteria is
4% higher, demonstrating the effectiveness of this method.
Abdunabi Ubul : received his B. Sc. degree in economics and
management information from Xinjiang University, China, in
2004. He received his M. Sc. degree from the Department of
Economics, Faculty of Integrated Arts and Sciences, University
of Tokushima, Japan, in 2008, and his Ph. D. degree from the
Department of Information Science and Intelligent Systems,
University of Tokushima, Japan, in 2012. His research interests
include information retrieval, natural language processing, and
document processing.
Hidekazu Kakei : received his B.Eng. and M.Eng. degrees in
architecture from Nagoya University, Japan, in 1988 and 1990,
respectively, and his Ph.D. in architecture from Kobe University in 2007.
Since 2003 he has been an Assoc. Prof. in the Institute of Socio-Arts
and Sciences, Tokushima University, Japan. His research interests
include applying ICT to spatial and environmental design. He is a
member of the Architectural Institute of Japan and the Institute of
Electronics, Information and Communication Engineers.
Jun-ichi Aoe : received his B. Sc. and M. Sc. degrees in electronic
engineering from the University of Tokushima, Japan, in
1974 and 1976, respectively, and his Ph. D. degree in communication
engineering from the University of Osaka, Japan in 1980.
Since 1976 he has been with the University of Tokushima. He
is currently a Professor in the Department of Information Science &
Intelligent Systems, Tokushima University, Japan. His research
interests include design of an automatic selection method
of key search algorithms based on expert knowledge bases, natural
language processing, a shift-search strategy for interleaved
LR parsing, a robust method for understanding NL interface
commands in an intelligent command interpreter, and trie compaction
algorithms for large key sets. He is the editor
of the Computer Algorithm Series of the IEEE Computer Society
Keywords: Blog Document, Field Association, Attributes Grammar, Sensibility
In this paper, we have presented a method that uses field
association words and sensibility to create
summary documents from attributes of the text
information taken from blogs: fields, keywords, and sensibility.
To obtain the materials for generating summary documents,
we first applied field association words to the data acquired
from the blog and determined the blog’s field. We then performed
sensibility analysis of the emotions of the people who
appear in the contents of the blog and determined the
sensibility. Once all three attributes were prepared,
we used the attribute grammar to generate the summary document,
thereby establishing a formal and efficient generation
technology.
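The pipeline summarized above can be illustrated with a small sketch. This is not the authors' implementation: the lexicons (FIELD_WORDS, SENSIBILITY_WORDS), the function names, and the single production rule are hypothetical, and simple word counting stands in for the field and sensibility analysis described in the paper. It only shows how three attributes extracted from a blog text could be combined by a grammar-style rule into a summary sentence.

```python
from collections import Counter

# Hypothetical field association words: term -> field (toy data, not from the paper).
FIELD_WORDS = {
    "goal": "sports", "match": "sports", "pitcher": "sports",
    "election": "politics", "minister": "politics",
}

# Hypothetical sensibility lexicon: term -> sensibility label (toy data).
SENSIBILITY_WORDS = {
    "thrilled": "joy", "happy": "joy",
    "disappointed": "sadness", "angry": "anger",
}


def tokenize(text):
    """Lowercase the text, split on whitespace, and strip punctuation."""
    return [w.strip(".,!?;:\"'()").lower() for w in text.split()]


def most_common_label(tokens, lexicon, default="unknown"):
    """Return the label whose lexicon terms occur most often in tokens."""
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    return counts.most_common(1)[0][0] if counts else default


def extract_attributes(text):
    """Compute the three attributes: field, associated terms, sensibility."""
    tokens = tokenize(text)
    field = most_common_label(tokens, FIELD_WORDS)
    # Associated terms: words of the chosen field, in order of first appearance.
    terms = []
    for t in tokens:
        if FIELD_WORDS.get(t) == field and t not in terms:
            terms.append(t)
    sensibility = most_common_label(tokens, SENSIBILITY_WORDS)
    return {"field": field, "terms": terms, "sensibility": sensibility}


def generate_summary(attrs):
    """Toy attribute-grammar-style generation.

    Production: Summary -> FieldPhrase TermPhrase SensibilityPhrase
    Each nonterminal synthesizes a text attribute from the input attributes,
    and the root concatenates the text attributes of its children.
    """
    field_phrase = f"This blog post is about {attrs['field']}."
    term_phrase = (
        f"It mentions {', '.join(attrs['terms'])}." if attrs["terms"] else ""
    )
    sens_phrase = f"The overall sensibility is {attrs['sensibility']}."
    return " ".join(p for p in (field_phrase, term_phrase, sens_phrase) if p)


if __name__ == "__main__":
    post = "The match was great! Our team scored a late goal and I was thrilled."
    print(generate_summary(extract_attributes(post)))
```

In a realistic setting the lexicons would come from field association word and sensibility knowledge bases, and the grammar would contain many productions with both synthesized and inherited attributes rather than one fixed template.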