Submission to Dr. Stephen Hood, DSTO ESL, Information Technology Division
PROJECT PROPOSAL: Conceptual Schemata, Data Management and Intelligent Navigation of Large Scale Knowledge Bases
P.W. Eklund and P.H. Martin School of Information Technology Griffith University Parklands Drive, Southport, PMB 50, Gold Coast Mail Centre. QLD 9726 Telephone: (07) 5594 8265 Facsimile: (07) 5594 8066 email addresses: p.eklund@mailbox.gu.edu.au and philippe.martin@mailbox.gu.edu.au
PREAMBLE
The project timetable is given as 1 year. This schedule represents both a continuation and a revision of the schedule agreed for the 1998/99 research agreement, following discussions at the Gold Coast in January 1999 with Stephen Hood and others. The agreed budget for 1999/2000 is $80K: salary of Dr. P. Martin, $50,758 + 26% on-costs = $63,995; equipment, 1 SGI Visual PC workstation = $11,500; the remainder for software, travel and books.
The agreement is subject to approval from the Deputy Vice Chancellor Research of Griffith University, Prof. Dennis Lincoln.
AIMS
The research agreement emphasizes: (i) the further refinement of WebKB (html description); (ii) the production of scientific publications resulting from it (these to publicize its use for data and knowledge management); (iii) the development of new and innovative applications of WebKB; (iv) the promotion and use of conceptual structures as a knowledge representation language for large scale knowledge modeling; (v) liaison with cognate research groups at the DSTC and relevant groups in the United States.
OBJECTIVES
Develop a set of web-accessible tools intended for:
- construction and maintenance of knowledge bases using the conceptual graph formalism;
- an intuitive notation called formalised English, directly translatable into conceptual graphs;
- the use of conceptual graphs for indexing and connecting text and non-text document elements;
- retrieval and merging of conceptual graphs and/or the elements they index;
- co-operative construction of knowledge bases by multiple users.
ACHIEVEMENTS
The objectives require tools that explore the idea of using a knowledge representation for information retrieval and merging. The framework will assist users to create and browse both knowledge and the information sources that the knowledge encodes.
A web-based architecture (Internet or Intranet) for access to the tools and knowledge sources simplifies the development of prototypical tools. For example, web-browsers may be reused and all information on the WWW exploited as a potential knowledge source. This has been the approach to this point, with WebKB demonstrating this open-system architecture. For generality, all tools are web-accessible and can therefore access any knowledge or raw information accessible on the web.
Naturally, web-accessible tools can access, but not modify, data on the disks of remote users. Thus, the objective of co-operative knowledge base construction is to develop a client/server architecture that maintains an index of knowledge created by multiple users. The usual problems of maintaining such conglomerate knowledge stores are compounded, in the case of a deductive store, by the need to maintain consistency.
Up to this point, with DSTO sponsorship, we have developed the basic tools and software architecture to achieve all the above objectives except those related to multi-user knowledge bases.
Existing tools are:
- a re-usable top-level ontology for concept and relation types;
- a generic hierarchy browser (plus instantiated browsers for concept and relation type hierarchies) implemented in Javascript;
- a text-based conceptual graph editor implemented in Javascript;
- a graphic conceptual graph editor (WebKB-GE) implemented in Java;
- a Javascript tool to index document elements by conceptual graphs;
- a Javascript tool to connect document elements to conceptual relations;
- a CGI script implemented in C which allows:
1) Unix-style file handling on WWW-accessible documents (e.g. grep, awk, count, etc.) and combination of commands via a script language;
2) construction and retrieval of conceptual graphs or document elements indexed by conceptual graphs. The conceptual graph workbench CoGITo is exploited and extended.
DELIVERY
WebKB is accessible via a Javascript enabled browser, i.e. IE 4.0/Netscape 4.0 (or upwards).
TASK SCHEDULE
1999/2000
1. Implementation of a large-scale Knowledge-Based Management System (KBMS) on top of a Database Management System (DBMS). The KBMS will allow users to store knowledge in the form of a semantic network, and to retrieve parts of this knowledge via conceptual queries (4 months).
Relational databases rely on a fixed number of tables. Similarly, most object-oriented databases and deductive databases rely on a fixed number of types or relations. Thus, they are inadequate or inefficient for storing a constantly changing semantic network of concepts, concept types, relations and relation types. Besides, the DBMS should support simultaneous access by many users. Therefore, we will choose a flexible, multi-user object manager such as Shore. Shore has the advantage of a C++ interface and is both mature and freely available for commercial applications.
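To illustrate the storage requirement above, here is a minimal sketch, assuming a plain in-memory Python structure as a stand-in for an object store. It is not the Shore interface; the class name, link names and sample data are all hypothetical. The point it shows is that concept types and concepts are stored uniformly as linked nodes, so adding a new type at run time does not require any schema change.

```python
from collections import defaultdict


class SemanticNetwork:
    """In-memory stand-in for an object store (hypothetical, not Shore).

    Concept types and concepts are all plain nodes joined by named links.
    """

    def __init__(self):
        # (source node, link name) -> set of target nodes
        self.links = defaultdict(set)

    def add(self, source, link_name, target):
        """Record a named link between two nodes."""
        self.links[(source, link_name)].add(target)

    def targets(self, source, link_name):
        """Return a copy of the targets reachable from source via link_name."""
        return set(self.links.get((source, link_name), ()))


net = SemanticNetwork()
net.add("Cat", "subtype_of", "Animal")   # a concept type
net.add("tom", "instance_of", "Cat")     # a concept (an individual)
net.add("Kitten", "subtype_of", "Cat")   # new type added later, no schema change
```

A persistent, multi-user object manager would replace the dictionary, but the uniform node-and-link shape of the data would stay the same.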
The KBMS will allow users to maintain their own knowledge, or to annotate and complement knowledge generated by other users. As with the current WebKB, knowledge will be entered via conceptual graphs, formalised English, structured text or document elements connected by conceptual relations.
Currently, WebKB does not merge the CGs of the users into a single semantic network, and can only retrieve CGs that include or specialize a query CG. The KBMS will integrate the CGs of the users into a single semantic network (whenever it is logically and semantically correct to do so), and allow the retrieval of the parts that include or specialize a query CG. Thus, more answers will be given to a query.
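The retrieval of CGs that specialize a query CG can be sketched as follows. This is a simplified illustration, not the WebKB or CoGITo implementation: a CG is reduced to a set of (concept type, relation, concept type) triples, the test is made triple by triple and ignores co-reference between nodes, and the type hierarchy, graph names and sample data are hypothetical.

```python
# Hypothetical type hierarchy: each type maps to its direct supertypes.
SUPERTYPES = {
    "Cat": {"Animal"},
    "Dog": {"Animal"},
    "Animal": {"Entity"},
    "Mat": {"Artifact"},
    "Artifact": {"Entity"},
    "Entity": set(),
}


def is_subtype(t, ancestor):
    """True if t equals ancestor or ancestor is reachable via supertype links."""
    if t == ancestor:
        return True
    return any(is_subtype(p, ancestor) for p in SUPERTYPES.get(t, ()))


def specializes(graph, query):
    """Triple-wise test: every query triple is matched by some graph triple
    with the same relation and equal or more specific concept types."""
    return all(
        any(r == qr and is_subtype(s, qs) and is_subtype(o, qo)
            for (s, r, o) in graph)
        for (qs, qr, qo) in query
    )


def retrieve(store, query):
    """Return the names of the stored graphs that specialize the query."""
    return [name for name, g in store.items() if specializes(g, query)]


STORE = {
    "g1": {("Cat", "on", "Mat")},
    "g2": {("Dog", "near", "Mat")},
}
QUERY = {("Animal", "on", "Artifact")}
print(retrieve(STORE, QUERY))  # prints ['g1']
```

Once the users' CGs are merged into a single network, the same test could be applied to parts of that network rather than to each stored graph separately, which is why more answers can be returned.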
2. Initialisation of the KB with top-level ontologies and the natural language ontology WordNet (1 month).
Such an ontology has several advantages:
- it allows some semantic checks on user-generated knowledge;
- it frees the casual user from organising his/her own ontology, since s/he only has to add a few concept and relation types relevant to his/her domain;
- it supports queries that use natural-language terms not explicitly defined in a client's knowledge vocabulary;
- knowledge from different users will be comparable since it is organised into a shared natural language ontology.
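As an illustration of the first advantage, the sketch below shows one kind of semantic check a shared ontology makes possible: verifying that the concept types used with a relation conform to that relation's declared signature. The type names, the "author" relation and its signature are hypothetical examples and are not taken from WordNet or WebKB.

```python
# Hypothetical single-inheritance ontology: type -> direct supertype.
SUPERTYPE = {
    "Person": "Agent",
    "Agent": "Entity",
    "Report": "Document",
    "Document": "Entity",
    "Entity": None,
}

# Hypothetical relation signatures: relation -> (subject type, object type).
SIGNATURES = {"author": ("Document", "Agent")}


def is_a(t, ancestor):
    """Walk up the supertype chain from t looking for ancestor."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE.get(t)
    return False


def check(subject_type, relation, object_type):
    """Return a list of problems found in one user statement (empty if none)."""
    if relation not in SIGNATURES:
        return ["unknown relation: " + relation]
    want_subject, want_object = SIGNATURES[relation]
    problems = []
    if not is_a(subject_type, want_subject):
        problems.append(subject_type + " is not a " + want_subject)
    if not is_a(object_type, want_object):
        problems.append(object_type + " is not a " + want_object)
    return problems


print(check("Report", "author", "Person"))  # prints []
print(check("Person", "author", "Report"))
```

The second call reports two problems, because the subject and object have been swapped relative to the declared signature.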
3. Implementation of protocols and visualisation techniques to allow the cooperative building of the KB by multiple users (3 months).
Visualisation techniques (e.g. aliases and viewpoints) allow users to interact with the system without being overwhelmed by large amounts of knowledge. The protocols will help each user to complement existing knowledge (i.e. mainly the knowledge of other users) and to resolve inconsistencies that s/he or the system has detected. The originality of the protocols is that they do not require a user to negotiate with every other user with whom s/he disagrees. This point also ensures the scalability of the approach.
4. Documentation, articles, conferences, presentation and training (4 months).
Schedule: task 1 (4 months), task 2 (1 month), task 3 (3 months), task 4 (4 months). Total: 12 months.
