Snobol4 in Practice: Recognizing Chemical Formulas in Text

Snobol4amuwap, posted 4 days ago



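Recognizing chemical formulas is a common task in cheminformatics: it means extracting information about molecular structure from running text. Snobol4 is an old programming language known for concise, powerful text processing. In this article we use Snobol4 to build a simple recognizer that extracts chemical formulas from text.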

Introduction to Snobol4

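Snobol4 is a high-level programming language. The original SNOBOL was designed by David J. Farber, Ralph E. Griswold, and Ivan P. Polonsky at Bell Labs in 1962; Snobol4, the best-known version, followed in the late 1960s. It is particularly well suited to text processing tasks such as searching, replacing, and formatting text. Its syntax is compact, and its patterns are ordinary values that can be stored in variables and combined, which is where most of its power comes from.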
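As a rough illustration of the searching and replacing mentioned above, here is a minimal sketch. The variable name `TEXT` and the label `NOHIT` are made up for this example; the statements themselves use only the standard pattern-match and replacement forms.

```snobol
* Search: does the subject contain the word 'glucose'?
        TEXT = 'The molecule C6H12O6 is a glucose.'
        TEXT 'glucose'                              :F(NOHIT)
        OUTPUT = 'found glucose'
* Replace: swap the first run of digits for '#'.
NOHIT   TEXT SPAN('0123456789') = '#'
        OUTPUT = TEXT
END
```

A statement of the form `subject pattern` is a search, and `subject pattern = value` replaces the matched part. Here the second statement should turn `C6` into `C#`, because matching is unanchored by default and stops at the first digit run it finds.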

The formula recognition task

The goal is to extract chemical formulas from text. For example, given the following text:


The molecule C6H12O6 is a glucose.

In this example we need to recognize and extract `C6H12O6`, the molecular formula of glucose.
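For this one sentence shape, the extraction can be sketched by matching the fixed text on both sides of the formula. This is only an illustration: it assumes every input line reads `The molecule <formula> is a ...`, and the variable names `LINE` and `F` are hypothetical.

```snobol
* Hypothetical sketch: the phrases 'The molecule ' and ' is a ' are
* assumed to surround the formula, as in the sample sentence.
        LINE = 'The molecule C6H12O6 is a glucose.'
* ARB matches the shortest possible string between the two phrases,
* and the conditional assignment (.) stores it in F on success.
        LINE 'The molecule ' ARB . F ' is a '       :F(END)
        OUTPUT = F
END
```

This approach breaks as soon as the surrounding wording changes, so a recognizer for free text needs a pattern that describes the formula itself.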

Snobol4 implementation

The following example is a simple chemical formula recognizer written in Snobol4:

```snobol
* Print every chemical formula (for example C6H12O6) found in the input.
* Only the element symbols listed in ELEM are recognized. Lowercase
* variants are left out because they would match ordinary words.
        DIGITS  = '0123456789'
* Two-letter symbols come first so 'Cl' is not read as 'C' plus 'l'.
        ELEM    = 'Cl' | 'Br' | 'At' | 'C' | 'H' | 'O' | 'N' | 'P' | 'S' | 'B' | 'F' | 'I'
* A term is an element symbol followed by an optional count.
        TERM    = ELEM (SPAN(DIGITS) | '')
* A formula is one or more terms. The recursive alternative is tried
* first, so the match takes as many terms as possible.
        FORMULA = TERM (*FORMULA | '')
* Read lines until end of input.
LOOP    LINE = INPUT                                :F(END)
* Remove the next formula from LINE and print it, until none is left.
* Note: no word-boundary check is made, so a lone capital such as the
* 'S' in 'Sugar' is also reported.
NEXT    LINE FORMULA . F =                          :F(LOOP)
        OUTPUT = F                                  :(NEXT)
END