TL;DR: An algorithm is presented that identifies the predominant font in which the running text in an English language document is printed, and the repeated words in the document are utilized to overcome noise in the input.
TL;DR: A defect model is considered validated if the OCR errors it induces are indistinguishable from the errors encountered with real scanned documents; four measures are described to quantify this similarity.
Abstract: Considers the problem of evaluating character image generators that model distortions encountered in optical character recognition (OCR). While a number of such defect models have been proposed, the contention that they produce the desired result is typically argued in an ad hoc and informal way. The authors introduce a rigorous and more pragmatic definition of when a model is accurate: they say a defect model is validated if the OCR errors induced by the model are indistinguishable from the errors encountered when using real scanned documents. The authors describe four measures to quantify this similarity, and compare and contrast them using over ten million scanned and synthesized characters in three fonts. The measures differentiate effectively between different fonts and different scans of the same font regardless of the underlying text.
TL;DR: In this article, a method and system for allowing scalers to support multiple font formats in a graphics system that processes data having a specified font format is presented, which includes actively registering each of the scalers with a font scaler manager by specifying a primary font format and one or more secondary font formats that are supported by each scaler.
Abstract: A method and system for allowing scalers to support multiple font formats in a graphics system that processes data having a specified font format. The method and system includes actively registering each of the scalers with a font scaler manager by specifying a primary font format and one or more secondary font formats that are supported by each of the scalers. The font scaler manager then selects one of the scalers to process the data by finding a match between the specified font format and the primary font formats registered by the scalers. If a match is not found, then one of the scalers is selected by finding a match between the specified font format and the secondary font formats registered by the scalers.
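The two-pass selection described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names `Scaler` and `FontScalerManager`, the format strings, and the rule that the first registered match wins are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Scaler:
    """A scaler with one primary and zero or more secondary formats (hypothetical type)."""
    name: str
    primary_format: str
    secondary_formats: tuple = ()


class FontScalerManager:
    """Hypothetical manager that scalers register with."""

    def __init__(self):
        self._scalers = []

    def register(self, scaler):
        # Registration records the primary and secondary formats the scaler declares.
        self._scalers.append(scaler)

    def select(self, font_format):
        # Pass 1: look for a scaler whose primary format matches.
        for scaler in self._scalers:
            if scaler.primary_format == font_format:
                return scaler
        # Pass 2: only if no primary match exists, fall back to secondary formats.
        for scaler in self._scalers:
            if font_format in scaler.secondary_formats:
                return scaler
        return None


# Format names below are placeholders chosen for the example.
manager = FontScalerManager()
manager.register(Scaler("scaler_a", "TrueType", ("Type1",)))
manager.register(Scaler("scaler_b", "Type1", ("CFF",)))
```

Here a request for "Type1" goes to `scaler_b` (a primary match) even though `scaler_a` was registered first and lists "Type1" as a secondary format.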
TL;DR: In this article, an image font file is created containing compressed bitmap representations of the characters of one or more fonts utilized for a given text, which are derived as character image templates corresponding with a font of an enlarged size.
Abstract: An image font file is created containing compressed bitmap representations of the characters of one or more fonts utilized for a given text. These compressed bitmap representations are derived as character image templates corresponding with a font of an enlarged size. Upon being conveyed to client software, the individual characters of the image font file are accessed, and while remaining in a compressed format are selectively shifted in accordance with typesetting specification error values, then scaled and filtered to produce a display character in anti-aliased, sub-pixel position format.
TL;DR: In this article, the authors present a method and apparatus for rendering characters on one or more output devices using multiple media fonts. One type of multiple media font is the multiple color font ("MCF"), a scalable font (a font used to render characters in multiple sizes and output device pixel resolutions).
Abstract: Method and apparatus for rendering characters on one or more output devices using multiple media fonts; one type of a multiple media font being a multiple color font ("MCF"). An MCF is a scalable font (a font used to render characters in multiple sizes and output device pixel resolutions) that enables the use of one or more types of color scheme (color design) data and character shape data in conjunction with one or more types of transformation data (shapes and visual looks) to render characters. Inventive multiple media fonts can be embodied for use: (a) in coordination with present operating systems; (b) as an annex to applications programs; (c) in coordination with, or as an integral part of, an operating-environment; and (d) over the Internet or other computer networks.
TL;DR: In this article, a document is examined to detect each font, and each glyph of a font appearing in the document, and if all of the detected fonts are currently stored in an imaging device, the document is sent to the device.
Abstract: When a document imaging operation commences, a document is examined to detect each font, and each glyph of a font, appearing in the document. If all of the detected fonts are currently stored in an imaging device, the document is sent to the device. If one or more detected fonts is not stored in the device, the detected glyphs of that font are mapped to a sparse font set. The device is queried to determine whether it can store the sparse font set. If it can, the sparse font set is downloaded to the device. If the device cannot store the sparse font set, the document is converted into a bit-mapped image, which is then imaged.
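The decision flow in this abstract can be sketched as a single function. This is a rough illustration under stated assumptions: the function name `plan_imaging`, the dictionary and set data shapes, the action labels, and the `device_can_store` callback (standing in for the device query) are all hypothetical.

```python
def plan_imaging(document_glyphs, device_fonts, device_can_store):
    """Decide how to image a document.

    document_glyphs: dict mapping font name -> set of glyphs used in the document.
    device_fonts: set of font names already stored in the imaging device.
    device_can_store: callable taking the sparse font sets and returning True
        if the device reports it can store them (stand-in for the device query).
    Returns an (action, payload) tuple.
    """
    # Fonts used by the document that the device does not already hold.
    missing = {font: glyphs for font, glyphs in document_glyphs.items()
               if font not in device_fonts}
    if not missing:
        return ("send_document", {})

    # A sparse font set contains only the glyphs the document actually uses.
    sparse_sets = {font: sorted(glyphs) for font, glyphs in missing.items()}
    if device_can_store(sparse_sets):
        return ("download_sparse_fonts", sparse_sets)

    # Last resort: convert the document to a bit-mapped image.
    return ("rasterize_document", {})
```

For example, `plan_imaging({"A": {"a", "b"}}, set(), lambda s: True)` selects the sparse-font download path and carries only glyphs `a` and `b` for font `A`.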
TL;DR: Digital typefaces for computer graphics and multimedia applications should be capable of supporting operations such as font variations, transformations, deformations and blending, but today's most advanced typeface representations support only geometric outline representations and basic font variations.
Abstract: Digital typefaces for computer graphics and multimedia applications should be capable of supporting operations such as font variations, transformations, deformations and blending. A powerful implementation of such operations must rely on the inherent typographic attributes of the typeface. However, even today's most advanced typeface representations support only geometric outline representations and basic font variations. In this paper we discuss high-level typeface representations which we term Parametric Typographic Representations (PTRs). We present an algorithm for automatically extracting typographic elements of typefaces from their outline representation, which is an essential initial step in converting typefaces from outline representations to PTRs. The extracted typographic elements include serifs, bars, stems, slants, bows, arcs, curve stems and curve bars. Most notable is the treatment of serifs, which are represented by finite automata. The algorithm only needs to learn a serif type once, and is then capable of automatically recognizing it in different typefaces. We show an application of a PTR for automatic high-quality hinting of fonts, which is one of the most important stages in digital font production. Our system was used to generate hints for tens of thousands of Kanji, Roman and Hebrew characters.
TL;DR: A font substitution printer serving as a character output device includes a substitute font generation section, a storage section and other sections as mentioned in this paper, which includes a font attribute data base consisting of the font names, typeface names, encoding names and metrics names of loaded fonts and usable fonts.
Abstract: A font substitution printer serving as a character output device includes a substitute font generation section, a storage section and other sections. The storage section includes a font attribute data base consisting of the font names, typeface names, encoding names and metrics names of loaded fonts and usable fonts, a typeface attribute base, and a substitute encoding data base storing therein information as to a relationship between encodings and other encodings which substitute for the former encodings. The substitute font generation section extracts a typeface name, an encoding name and a metrics name from an output requested font name, selects a substitute typeface most similar in the typeface attribute to the extracted typeface name, and, out of substitute fonts consisting of one or more loaded fonts which satisfy not only the substitute encoding that is obtained in accordance with the extracted encoding name but also the selected substitute typeface, creates a substitute font with the weight or metrics thereof adjusted.
TL;DR: The font width cache as mentioned in this paper is designed for use in conjunction with the Unicode character set, which assigns phonetic-based characters to a lower subrange of character codes and ideographic characters to an upper subrange.
Abstract: A self-optimizing font width cache provides an efficient caching mechanism for providing font widths to an application program. The font width cache acts as a font width server that services requests for widths for any given character. The font width cache maintains good system performance by minimizing the number of calls to the operating system without requiring an inordinate amount of memory for the font width cache. The font width cache is designed for use in conjunction with the Unicode character set, which assigns phonetic-based characters to a lower subrange of character codes and ideographic characters to an upper subrange. Each font realization is associated with a font width cache, which includes two hash tables. One hash table is associated with each subrange of character codes. Both hash tables start out small and grow dynamically in response to the demands placed on the hash table by the data being displayed. The decision to increase the size of a hash table depends on the percentage of hash table slots that are in use and the number of collisions that have occurred when trying to access that hash table.
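A toy version of such a cache is sketched below. The abstract does not give the subrange boundary, the initial table size, the growth thresholds, or the collision policy, so the values and choices here (`SPLIT = 0x3000`, 8 slots, 75% load, 4 collisions, overwrite on collision, doubling on growth) are assumptions for illustration. `fetch_width` stands in for the operating-system call.

```python
class WidthTable:
    """Direct-mapped hash table of (character code, width) entries that grows on demand."""

    def __init__(self, size=8, max_load=0.75, max_collisions=4):
        self.slots = [None] * size
        self.used = 0          # number of occupied slots
        self.collisions = 0    # lookups that found a different code in the slot
        self.max_load = max_load
        self.max_collisions = max_collisions

    def lookup(self, code, fetch_width):
        i = code % len(self.slots)
        entry = self.slots[i]
        if entry is not None and entry[0] == code:
            return entry[1]                 # cache hit: no OS call needed
        width = fetch_width(code)           # cache miss: ask the (assumed) OS callback
        if entry is None:
            self.used += 1
        else:
            self.collisions += 1            # slot held another code; overwrite it
        self.slots[i] = (code, width)
        # Grow when too many slots are in use or too many collisions occurred.
        if (self.used / len(self.slots) > self.max_load
                or self.collisions > self.max_collisions):
            self._grow()
        return width

    def _grow(self):
        entries = [e for e in self.slots if e is not None]
        self.slots = [None] * (2 * len(self.slots))
        self.used = 0
        self.collisions = 0
        for code, width in entries:
            i = code % len(self.slots)
            if self.slots[i] is None:
                self.used += 1
            self.slots[i] = (code, width)


class FontWidthCache:
    """One cache per font realization, with one table per character-code subrange."""

    SPLIT = 0x3000  # assumed boundary between the lower and upper subranges

    def __init__(self, fetch_width):
        self.fetch_width = fetch_width
        self.lower = WidthTable()
        self.upper = WidthTable()

    def width(self, ch):
        code = ord(ch)
        table = self.lower if code < self.SPLIT else self.upper
        return table.lookup(code, self.fetch_width)
```

Keeping a separate table per subrange means a run of ideographic text grows only the upper table, while the lower table stays small for documents that use few phonetic characters.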
TL;DR: In this article, a character data processing method and apparatus enabling registration of font data generated by an external device into an output device even if the data format created by the external device is different from that of the output device is presented.
Abstract: A character-data processing method and apparatus enabling registration of font data generated by an external device into an output device even if the data format created by the external device is different from that of the output device. When font data generated by a host computer is sent to a printer, whether or not a 4-byte font identification flag of the font data is effective is judged. If effective, the font data is registered as an effective external character font. If not effective, the endianness of the data is converted by rearranging the byte order of the identification flag, and whether or not the font is effective is judged again. If effective, the font data is registered while performing endian conversion by rearranging the byte order of the font data itself. Thus, the difference in endianness between the host computer and the printer is offset. Regarding bitmap font data, endian conversion is performed such that the MSB side bit and the LSB side bit are exchanged. Thus, in a case where the endianness of the host computer differs from that of the printer, the font data generated by the host computer can be used by the printer.
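The flag check and byte-order conversion can be illustrated with Python's `struct` module. This is a simplified sketch: the flag value `EXPECTED_FLAG`, the function names, and the treatment of the font data as a flat sequence of 32-bit words are assumptions, since the abstract does not describe the actual data layout.

```python
import struct

EXPECTED_FLAG = 0x464F4E54  # hypothetical 4-byte identification flag


def register_font(data, native_order="<"):
    """Return font data in the printer's byte order, or None if the flag is invalid.

    Assumes `data` is a whole number of 32-bit words whose first word is the flag.
    """
    if len(data) < 4 or len(data) % 4:
        raise ValueError("font data must be a non-empty multiple of 4 bytes")

    # Step 1: judge the flag as read in the printer's own byte order.
    (flag,) = struct.unpack_from(native_order + "I", data, 0)
    if flag == EXPECTED_FLAG:
        return data

    # Step 2: rearrange the flag's bytes and judge it again.
    other_order = ">" if native_order == "<" else "<"
    (flag,) = struct.unpack_from(other_order + "I", data, 0)
    if flag != EXPECTED_FLAG:
        return None

    # Step 3: the flag is valid once swapped, so convert every word of the data.
    count = len(data) // 4
    words = struct.unpack(f"{other_order}{count}I", data)
    return struct.pack(f"{native_order}{count}I", *words)


def reverse_bits(byte):
    """Exchange the MSB-side and LSB-side bits of one bitmap byte."""
    return int(f"{byte:08b}"[::-1], 2)
```

Checking the flag in both orders lets the printer accept data from either kind of host without any out-of-band byte-order setting.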
TL;DR: In this paper, a section of text in an electronic document is marked with a tag that designates a list of font faces for use in drawing the text at a computer which is remotely browsing the electronic document from a computer network.
Abstract: A section of text in an electronic document is marked with a tag that designates a list of font faces for use in drawing the text at a computer which is remotely browsing the electronic document from a computer network. A browser at the computer chooses a font face for drawing the text from the font faces installed on the computer according to the tag-designated list, such as by matching font face name or set of font face characteristics. The browser also can include an alternative font table which stores data identifying alternative font faces for many common font faces. The browser searches the installed font faces for the alternative font faces to those in the tag-designated list when a direct match to the tag-designated font faces is not found.
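The browser-side lookup can be sketched as below. Only name matching is shown; the abstract also mentions matching on sets of font face characteristics, which is omitted here. The table contents, the function name `choose_font`, the case-insensitive comparison, and the final default are assumptions made for the example.

```python
# Hypothetical alternative font table: common face -> acceptable substitutes.
ALTERNATIVES = {
    "Helvetica": ["Arial"],
    "Times": ["Times New Roman"],
}


def choose_font(requested, installed, alternatives=ALTERNATIVES,
                default="system-default"):
    """Pick a face for drawing text.

    requested: ordered list of face names from the tag.
    installed: iterable of face names installed on the browsing computer.
    """
    by_lower = {name.lower(): name for name in installed}

    # Pass 1: direct name match, honoring the order of the tag-designated list.
    for face in requested:
        hit = by_lower.get(face.lower())
        if hit is not None:
            return hit

    # Pass 2: no direct match, so search for alternatives to the listed faces.
    for face in requested:
        for alt in alternatives.get(face, []):
            hit = by_lower.get(alt.lower())
            if hit is not None:
                return hit

    return default  # assumed fallback; the abstract does not specify one
```

With `requested=["Helvetica", "Times"]` and only "Arial" installed, the second pass returns "Arial" as the alternative to Helvetica.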
TL;DR: In this article, a system has been built that selects excerpts from a scanned document for presentation as a summary, without using character recognition, relying on the idea that the most significant sentences in a document contain words that are both specific to the document and have a relatively high frequency of occurrence within it.
TL;DR: In this article, a stroke-based font is defined by a stroke representation displayable in high-resolution and low-resolution space, which includes defining a basic stroke with key points and width values as its primary parameters and feature points and curve features as the secondary parameters.
Abstract: A method and apparatus for producing a stroke-based font defined by a stroke representation displayable in high-resolution and low-resolution space. The stroke representation includes defining a basic stroke with key points and width values as its primary parameters and feature points and curve features as the secondary parameters. Hinting information for certain key points provides information for displaying quality strokes in low-resolution space. A CAD tool allows a font designer to easily select the parameters for the design of basic strokes.
TL;DR: An intuitive and effective stroke extraction method that passes through the distorted region and gets the reliable information of global features by applying the trend-followed transcribing technique to correctly accomplish the tasks.
TL;DR: In this paper, a method and system for providing multiple typographic glyph data items to a requesting client from a font scaler sub-system is presented, which includes accepting a request from the client that describes multiple glyphs and a destination memory in which to store the glyphs.
Abstract: A method and system for providing multiple typographic glyph data items to a requesting client from a font scaler sub-system. The method and system includes accepting a request from the client that describes multiple glyphs and a destination memory in which to store the glyphs. From the request, a transaction message is formed and transmitted to a scaler server using an application program interface. The scaler server then generates the multiple glyph data items from the descriptions of the multiple glyphs, and stores the glyph data items directly into the destination memory.
TL;DR: In this paper, a method of selecting a primary font and a primary size for displaying text in an electronic book having a book-shaped housing is described, where a user-initiated event is received in which one word of the plurality of words (250) is selected.
Abstract: A method of selecting a primary font and a primary size for displaying text in an electronic book (118) having a book-shaped housing (100) includes displaying a plurality of words (250) using a corresponding plurality of combinations of a plurality of fonts and a plurality of sizes. A user-initiated event (212) is received in which one word of the plurality of words (250) is selected. The primary font is updated to a font with which the one word is displayed, and the primary size is updated to a size in which the one word is displayed.
TL;DR: The problem of segmenting Uygur characters in various fonts and sizes in printed scripts is presented, and the segmentation technique proceeds as follows: line separation, word separation, and segmentation of each word into isolated characters.
Abstract: In many OCR systems, character segmentation is a necessary preprocessing step for character recognition. It is an important step because incorrectly segmented characters are not likely to be correctly recognized. The most difficult case in character segmentation is cursive scripts. Uygur is written in a cursive script. This paper presents the problem of segmenting Uygur characters in various fonts and sizes in printed scripts. The segmentation technique proceeds as follows: line separation, word separation, and segmentation of each word into isolated characters. The last stage consists of two algorithms, topological segmentation and quasi-topological segmentation. Topological segmentation is based on tracing the outer contour of a given word. Quasi-topological segmentation bases the decision to section a character on a combination of feature-extraction and character-width measurements. Our approach relies on features of characters and fonts and on profile models.
TL;DR: In this paper, a data processing configuration includes a computer, a printer and a font memory which houses tables that define spacing metrics that are individual to each glyph included in a font, and the computer is connected to the font memory and further includes a printer driver function which is controllable to download glyph data to the printer in response to an output from an application running on the computer, which specifies a requirement that a glyph is to be printed by the printer.
Abstract: A data processing configuration includes a computer, a printer and a font memory which houses tables that define spacing metrics that are individual to each glyph included in a font. The computer is connected to the font memory and further includes a printer driver function which is controllable to download glyph data to the printer in response to an output from an application, running on the computer, which specifies a requirement that a glyph is to be printed by the printer. The computer is responsive to a download requirement to derive from the font memory, header data for transmission to the printer. The downloaded header data excludes the spacing metrics that are individual to each glyph of the font. Thereafter, the printer driver downloads required font glyph data by transmitting data structures to the printer which include, among other data elements, the spacing metrics that are individual to the specific glyph being transmitted. In such manner, spacing metrics are transmitted to the printer only with a particular associated glyph and only when that glyph is required at the printer, thereby reducing data transfer times and making more efficient use of printer memory.
TL;DR: In this paper, a character pattern generator includes a ROM for storing font data and attribute information of each point of each stroke forming the font data, and a CPU for recognizing a portion to be transformed of a contour shape of the stored font data.
Abstract: A character pattern generator includes a ROM for storing font data and attribute information of each point of each stroke forming the font data, and a CPU for recognizing a portion to be transformed of a contour shape of the stored font data based on the attribute information of each point. The CPU further adds to the recognized portion to be transformed, a control point for performing transformation into a character pattern having a specified typeface code. The CPU then calculates the coordinates of the control point based on a factor at each point of each stroke of the font data. The CPU then transforms the contour shape by using the added control point and the font data. Finally, the CPU generates the character pattern having the specified typeface code based on the contour shape of the transformed font data.
TL;DR: In this paper, an optical character recognition system identifies a font type for an image of a block of text by matching characters in a pre-defined character set located in the text block.
Abstract: An optical character recognition system identifies a font type for an image of a block of text. Key characters matching characters in a pre-defined character set are located in the text block. The image of the text block is partitioned into plural image segments where, for example, each image segment is an image of one line of text. Each image segment is evaluated to determine whether the characters in the image segment have fixed pitch or variable pitch. For each key character in the image segments designated as variable pitch, a determination is made whether a gap exists between a left edge of the key character image and the left border of the key character. A font type is identified for the characters in the image data based on the fixed pitch determinations and the gap determinations.
TL;DR: This article proposed a method for forming a font of a character or a symbol including inputting a plurality of coordinate data lying on a contour of a stroke specifying a dash of the character or the symbol.
Abstract: The present invention provides a method for forming a font of a character or a symbol including inputting a plurality of coordinate data lying on a contour of a stroke specifying a dash of the character or the symbol, preparing data of the stroke including a plurality of input coordinate data, having one of the input coordinate data as coordinate data specifying the beginning point of the stroke or as coordinate data specifying the end point of the stroke, and forming a font of the character or the symbol including the data of one or more of the strokes. The invention also includes a method of determining whether a portion of the font of the character or symbol is blurred or bled, and alters the font of the character or symbol accordingly. The determination of blurring or bleeding may be made based upon at least one of ink quantity, writing speed, paper construction, or other appropriate factors.
TL;DR: In this paper, a character recognizing apparatus is described that comprises a feature extractor for extracting a feature vector from an input character image, a recognition dictionary for storing a standard pattern of a character including font information, and an output unit for outputting a character code of the identified standard pattern and color information corresponding to the font kind.
Abstract: Character recognizing method and apparatus for recognizing a character in an input document image and outputting the document as a recognition result by a visible image. The apparatus comprises a feature extractor for extracting a feature vector from an input character image, a recognition dictionary for storing a standard pattern of a character including font information, and an output unit for outputting a character code of the identified standard pattern and color information corresponding to the font kind. The dictionary stores the standard pattern for every font.
TL;DR: A pager for receiving a text message from a transmitter and displaying characters corresponding to the text in different fonts is described in this article, where the pager includes a first font read-only-memory having a plurality of predetermined addresses.
Abstract: A pager for receiving a text message from a transmitter and displaying characters corresponding thereto in different fonts. The pager includes a first font read-only-memory having a plurality of predetermined addresses, for storing a first font corresponding to said predetermined addresses; an n-th font read-only-memory having the same plurality of predetermined addresses, for storing an n-th font having the same code as the first font but a different display form from the first font, wherein n≧2; and a controller for checking whether a received code forming the text message is an n-th font conversion code by comparing the received code with a pre-selected n-th font conversion code, for accessing an actual address of the n-th font read-only-memory corresponding to the received code when the received code is the n-th font conversion code so as to read the corresponding font data for a visual display of the read font data.
TL;DR: An image recording system and method for smoothing printed characters, in particular for low-resolution printers, was proposed in this paper, which includes a font decoding device, a font generating device and a font sizing device.
Abstract: An image recording system and method for smoothing printed characters, in particular, for low resolution printers. The image recording system includes a font decoding device, a font generating device and a font sizing device. The font decoding device decodes first font data including first character size information and first luminance (optical density) information. The first luminance information has two gradation levels for printing image data. The font generating device generates second font data including second character size information, which is larger than the first character size, based on the first font data. The font sizing device reduces the font of the second font data to the size of the first character and at the same time converts the first luminance (optical density) information using spatial filters into multiple gradations to reduce printed character jaggedness. Alternatively, the image recording system can use a pattern recognition spatial filter on the first font data to convert the first luminance information into multiple gradations to reduce printed character jaggedness.
TL;DR: In this article, a method and apparatus are disclosed for using information about colors, fonts, and viewing distances to predict the legibility of a sign, taking into account the colors used in the sign, to allow the use of many different color combinations while still avoiding "strobing" from simultaneous contrast caused by complementary afterimages.
Abstract: A method and apparatus are disclosed for using information about colors, fonts, and viewing distances to predict the legibility of a sign. The invention takes into account the colors used in the sign, in order to allow the use of many different color combinations while still avoiding "strobing" from simultaneous contrast caused by complementary afterimages. A computer program accepts information such as the desired visual acuity, viewer velocity (for signs viewed from a car), font specification, ambient light strength, and the desired typeface color and background color. The program then predicts the legibility of a sign which uses the fonts and colors indicated under the given conditions. Thus, the program may be used to assess the effect of various changes on the sign's legibility without actually rendering or building the sign, installing it, and looking at it.
TL;DR: The Digital Typography Sourcebook is a font reference covering more than 400 fonts, including new designs from some of the world's leading designers, with hundreds of before-and-after examples of professional typographic technique.
Abstract: From the Publisher:
Digital Typography Sourcebook is your ultimate font reference - it tells you how to identify different fonts, where to buy them and how much you should pay, what fonts to avoid, and how to create professional looking designs. You'll become familiar with over 400 fonts, including the hottest new designs from some of the world's leading designers. With the help of hundreds of before-and-after examples, expert Marvin Bryan gives you professional techniques for making the best use of digital typography in your work. He even tells the fascinating stories of the men and women who created many well-known typefaces and the circumstances that inspired them.
TL;DR: In this paper, a font data memory stores a plurality of message fonts, each corresponding to a different calling number group consisting of at least one calling number, and the font data memory is searched for a message font based on the corresponding calling number to display the message with that message font on a display.
Abstract: A selective calling receiver previously stores a predetermined number of calling numbers and selectively receives a message according to one of the calling numbers. The received message and its corresponding calling number are sequentially stored onto a message memory. A font data memory stores a plurality of message fonts each corresponding to a different calling number group. In the font data memory, the calling numbers are divided into a plurality of caller groups each consisting of at least one calling number. In response to the user's instruction, the font data memory is searched for a message font based on the corresponding calling number to display the message with the message font on a display.
TL;DR: In this paper, a font processing device is described that performs line thinning and line thickening in accordance with the resolution of a device while using an outline font.
Abstract: PURPOSE: To provide a font processing device performing a line thinning processing, an ordinary processing and a line thickening processing in accordance with the resolution of a device while using an outline font. CONSTITUTION: A raster processing part 22 generates a raster image into a first area 27 from an outline font stored in a storage part 21. An outline generating part 24 generates an outline image into a second area 28 while using an outline width determined by an outline width determining part 23. An operation part 25 performs a logical operation of the raster image and the outline image, and then the raster image whose line is thinned or thickened is generated.
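One plausible reading of the logical operation is sketched below. The abstract does not say which operations are used, so OR for thickening and AND-NOT for thinning are assumptions, as are the function name `combine` and the representation of each image as a list of per-row integer bitmasks.

```python
def combine(raster, outline, mode):
    """Combine a filled glyph bitmap with an outline bitmap, row by row.

    raster, outline: lists of non-negative ints, one bitmask per pixel row.
    mode: "thicken", "thin", or "normal".
    """
    if mode == "thicken":
        # Add the outline pixels to the fill (assumed: bitwise OR).
        return [r | o for r, o in zip(raster, outline)]
    if mode == "thin":
        # Remove the outline pixels from the fill (assumed: AND with the complement).
        return [r & ~o for r, o in zip(raster, outline)]
    if mode == "normal":
        return list(raster)
    raise ValueError(f"unknown mode: {mode}")
```

Under this reading, a wider outline width produces a stronger thinning or thickening effect, which fits the abstract's separate outline width determining part.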
TL;DR: In this article, an electronic mail from a computer communication network is printed on recording paper, and when the number of lines for one mail exceeds a threshold, it is divided into plural pages and set to a font size that is easy to read.
Abstract: PROBLEM TO BE SOLVED: To make an electronic mail easy to read and also to economize recording paper by making the size and font of characters different in accordance with the kind of information and the amount of information, based on a program, when the electronic mail from a computer communication network is printed on recording paper. SOLUTION: When an electronic mail is received, a CPU 1 sets fonts such as Mincho typeface, italic, etc., to use for each printing in accordance with a character code, a quotation and the text based on a font table T1. When the mail ends, it counts the number of lines for one mail and sets the maximum number of printable lines in a standard font size in accordance with the size of recording paper. When the number of lines for one mail exceeds a threshold, the mail is divided into plural pages and set to a font size that is easy to read. Thereby, the CPU 1 makes fonts different among a header part, a quotation and the text so that they may be clearly distinguished and the mail may become easy to read, and also changes the size of characters in accordance with the amount of information so that recording paper can be used efficiently.
TL;DR: In this paper, the authors use color information and deformation on individual character fonts and diagrams on the basis of voice information correlated with the character string data to recognize voice information added to a character string.
Abstract: PURPOSE: To intuitively and easily recognize voice information added to a character string. CONSTITUTION: Character string data stored in an external storage device 107 and pronunciation information correlated with that character string are read into a work memory 101. A deformed font generating part 108 generates font or diagram data to display and print the previously read-in character string data by referring to a character font data base 109. Color information and deformation are applied to individual character fonts and diagrams on the basis of voice information correlated with the character string data. The character font or diagram data processed to express voice information is transferred to a video memory 102 to be displayed on a display device 103, and also to a printing buffer 104 to be printed by a printer 105. In this way, the voice information can be displayed and printed in a visually and easily recognizable form.