TL;DR: This work attempts to model the drawing process of fonts by building sequential generative models of vector graphics, which has the benefit of providing a scale-invariant representation for imagery whose latent representation may be systematically manipulated and exploited to perform style propagation.
Abstract: Dramatic advances in generative models have resulted in near photographic quality for artificially rendered faces, animals and other objects in the natural world. In spite of such advances, a higher level understanding of vision and imagery does not arise from exhaustively modeling an object, but instead identifying higher-level attributes that best summarize the aspects of an object. In this work we attempt to model the drawing process of fonts by building sequential generative models of vector graphics. This model has the benefit of providing a scale-invariant representation for imagery whose latent representation may be systematically manipulated and exploited to perform style propagation. We demonstrate these results on a large dataset of fonts crawled from the web and highlight how such a model captures the statistical dependencies and richness of this dataset. We envision that our model can find use as a tool for graphic designers to facilitate font design.
TL;DR: A structure-guided Chinese font generation system, SCFont, by using deep stacked networks to integrate the domain knowledge of Chinese characters with deep generative networks to ensure that high-quality glyphs with correct structures can be synthesized.
Abstract: Automatic generation of Chinese fonts that consist of large numbers of glyphs with complicated structures is still a challenging and ongoing problem in the areas of AI and Computer Graphics (CG). Traditional CG-based methods typically rely heavily on manual interventions, while recently popularized deep learning-based end-to-end approaches often obtain synthesis results with incorrect structures and/or serious artifacts. To address those problems, this paper proposes a structure-guided Chinese font generation system, SCFont, by using deep stacked networks. The key idea is to integrate the domain knowledge of Chinese characters with deep generative networks to ensure that high-quality glyphs with correct structures can be synthesized. More specifically, we first apply a CNN model to learn how to transfer the writing trajectories with separated strokes in the reference font style into those in the target style. Then, we train another CNN model to learn how to recover shape details on the contour for synthesized writing trajectories. Experimental results validate the superiority of the proposed SCFont compared to the state of the art in both visual and quantitative assessments.
TL;DR: GlyphGAN as discussed by the authors is a style-consistent font generation method based on GANs, where the input vector for the generator network consists of two vectors: character class vector and style vector.
Abstract: In this paper, we propose GlyphGAN: style-consistent font generation based on generative adversarial networks (GANs). GANs are a framework for learning a generative model using a system of two neural networks competing with each other. One network generates synthetic images from random input vectors, and the other discriminates between synthetic and real images. The motivation of this study is to create new fonts using the GAN framework while maintaining style consistency over all characters. In GlyphGAN, the input vector for the generator network consists of two vectors: character class vector and style vector. The former is a one-hot vector and is associated with the character class of each sample image during training. The latter is a uniform random vector without supervised information. In this way, GlyphGAN can generate an infinite variety of fonts with the character and style independently controlled. Experimental results showed that fonts generated by GlyphGAN have style consistency and diversity different from the training images without losing their legibility.
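The generator input described in the GlyphGAN abstract (a one-hot character class vector joined with a uniform random style vector) can be illustrated with a minimal sketch. The dimensions (26 classes, 100 style components), the sampling range [-1, 1], and all function names are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import random


def one_hot(index, num_classes):
    """Return a one-hot list encoding the character class `index`."""
    if not 0 <= index < num_classes:
        raise ValueError("character index out of range")
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def sample_style(style_dim, rng):
    """Draw a style vector from a uniform distribution (range is assumed)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(style_dim)]


def generator_input(char_index, style, num_classes=26):
    """Concatenate the class vector and the style vector."""
    return one_hot(char_index, num_classes) + list(style)


# Reusing one style vector across all character classes is what would
# keep a generated font stylistically consistent from 'A' to 'Z'.
rng = random.Random(0)
style = sample_style(100, rng)
font_inputs = [generator_input(c, style) for c in range(26)]
```

Sampling a fresh style vector while sweeping the character index again would correspond to requesting a different font, which is how character and style are controlled independently.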
TL;DR: This paper proposes a method to modify text in an image at character-level using two different neural network architectures - FANnet to achieve structural consistency with source font and Colornet to preserve source color.
Abstract: Textual information in a captured scene plays an important role in scene interpretation and decision making. Though there exist methods that can successfully detect and interpret complex text regions present in a scene, to the best of our knowledge, there is no significant prior work that aims to modify the textual information in an image. The ability to edit text directly on images has several advantages including error correction, text restoration and image reusability. In this paper, we propose a method to modify text in an image at character-level. We approach the problem in two stages. At first, the unobserved character (target) is generated from an observed character (source) being modified. We propose two different neural network architectures - (a) FANnet to achieve structural consistency with source font and (b) Colornet to preserve source color. Next, we replace the source character with the generated character maintaining both geometric and visual consistency with neighboring characters. Our method works as a unified platform for modifying text in images. We present the effectiveness of our method on COCO-Text and ICDAR datasets both qualitatively and quantitatively.
TL;DR: The current study supports the idea that metacognitive beliefs underlie font size effects in metamemory and reveals that people’s font size beliefs have some accuracy.
Abstract: Words printed in a larger 48-point font are judged to be more memorable than words printed in a smaller 18-point font, although font size does not affect actual memory. To clarify the basis of this font size effect on metamemory and memory, 4 experiments investigated how presenting words in 48 (Experiment 1) or 4 (Experiments 2 to 4) font sizes between 6 point and 500 point affected judgments of learning (JOLs) and recall performance. Response times in lexical decision tasks were used to measure perceptual fluency. In all experiments, perceptual fluency was lower for words presented in very small and very large font sizes than for words presented in intermediate font sizes. In contrast, JOLs increased monotonically with font size, even beyond the point where a large font impaired perceptual fluency. Assessments of people's metacognitive beliefs about font size revealed that the monotonic increase in JOLs was not due to beliefs masking perceptual fluency effects (Experiment 3). Also, JOLs still increased across the whole range of font sizes when perceptual fluency was made salient at study (Experiment 4). In all experiments but Experiment 4, recall performance increased with increasing font size, although to a lesser extent than JOLs. Overall, the current study supports the idea that metacognitive beliefs underlie font size effects in metamemory. As important, it reveals that people's font size beliefs have some accuracy. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
TL;DR: This approach can effectively improve the efficiency of font generation, reduce the costs of designers, and is able to inherit the style of existing fonts.
Abstract: With the rapid growth of multimedia information, font libraries have become a part of people’s working lives. Compared with Western alphabets, it is difficult to create a new Chinese font because of the huge number of characters and their complex shapes. At present, most research on automatic font generation uses traditional methods that require a large number of rules and parameters set by experts, and these methods are not widely adopted. This paper divides Chinese characters into strokes, generates new font strokes by fusing the styles of two existing font strokes, and assembles them into new fonts. This approach can effectively improve the efficiency of font generation, reduce the costs of designers, and is able to inherit the style of existing fonts. To learn to generate new fonts, Generative Adversarial Nets, a popular deep learning technique, are used. Compared with the traditional methods, this approach can generate higher-quality fonts without a carefully designed, complex loss function.
TL;DR: This paper investigates the characteristics of fonts used in Japan considering the HMD swing that results from walking, from the viewpoints of readability (text readability) and legibility (ease of letter recognition) using six different fonts.
Abstract: In wearable computing environments, users acquire visual information in various scenes using a head mounted display (HMD). However, this induces problems due to their differences from conventional displays such as smartphones and e-books. In this research, we focused on the problem of vertical shock caused by walking. This problem interferes with seeing information on HMDs. In this paper, we discuss the selection of font shapes to minimize the effect of this problem. If we can clarify the characteristics of the readability of fonts in wearable computing environments, application designers can select fonts that strike a balance between intended design elements and readability. In this paper, we first investigated the characteristics of fonts used in Japan considering the HMD swing that results from walking, from the viewpoints of readability (text readability) and legibility (ease of letter recognition) using six different fonts. From this evaluation, we find that fonts with very thin horizontal lines and with very thin horizontal and vertical lines should not be presented on HMDs.
TL;DR: A novel framework that helps users to explore a font dataset using the multimodal method that provides unexpected but useful font images or concept words in response to the user's input and observes that the model produces highly promising results.
Abstract: When searching for suitable fonts for a digital graphic, users usually start with an ambiguous thought. For example, they would look for fonts that are suitable for a personal web page or party invitations for children. Their design concept becomes clearer as they interact with external interventions such as exposure to suitable images for use in their web page or the children's preferences regarding the party. Hence, it is important to support users' interactions with unexpected but useful concepts during their search. In this paper, we present a novel framework that helps users to explore a font dataset using a multimodal method that provides unexpected but useful font images or concept words in response to the user's input. We collect a large font dataset and the associated tags and propose the use of an unsupervised generative model that jointly learns the correlation between the visual features of a font and the associated tags for the creative process. By examining how the results of the model change with various inputs, we observed that the model produces highly promising results. In the experiment, we verified that the concepts generated by the model are not only new but also relevant to the user input, which appears to be useful for inspiring users.
TL;DR: A new, public dataset for the recognition of font groups in early printed books is introduced, and several state-of-the-art CNNs for the font group recognition task are evaluated.
Abstract: Based on contemporary scripts, early printers developed a large variety of different fonts. While fonts may slightly differ from one printer to another, they can be divided into font groups, such as Textura, Antiqua, or Fraktur. The recognition of font groups is important for computer scientists to select adequate OCR models, and of high interest to humanities scholars studying early printed books and the history of fonts. In this paper, we introduce a new, public dataset for the recognition of font groups in early printed books, and evaluate several state-of-the-art CNNs for the font group recognition task. The dataset consists of more than 35 600 page images, each page showing up to five different font groups, of which ten are considered in this dataset.
TL;DR: This paper proposes a novel generative feature learning algorithm that leverages the unique characteristics of fonts and designs an integrated rendering and learning process so that the visual feature from one image can be used to reconstruct another image with different text.
Abstract: Font selection is one of the most important steps in a design workflow. Traditional methods rely on ordered lists which require significant domain knowledge and are often difficult to use even for trained professionals. In this paper, we address the problem of large-scale tag-based font retrieval which aims to bring semantics to the font selection process and enable people without expert knowledge to use fonts effectively. We collect a large-scale font tagging dataset of high-quality professional fonts. The dataset contains nearly 20,000 fonts, 2,000 tags, and hundreds of thousands of font-tag relations. We propose a novel generative feature learning algorithm that leverages the unique characteristics of fonts. The key idea is that font images are synthetic and can therefore be controlled by the learning algorithm. We design an integrated rendering and learning process so that the visual feature from one image can be used to reconstruct another image with different text. The resulting feature captures important font design details while being robust to nuisance factors such as text. We propose a novel attention mechanism to re-weight the visual feature for joint visual-text modeling. We combine the feature and the attention mechanism in a novel recognition-retrieval model. Experimental results show that our method significantly outperforms the state-of-the-art for the important problem of large-scale tag-based font retrieval.
TL;DR: The presented techniques following a deep learning architecture are equally suitable for the development of Arabic cursive scene text recognition systems, and the issues pertaining to text localization and feature extraction are presented.
Abstract: This paper presents a comprehensive survey on Arabic cursive scene text recognition. Recent publications in this field show that the interest of document image analysis researchers has shifted from recognition of optical characters to recognition of characters appearing in natural images. Scene text recognition is a challenging problem due to the text having variations in font styles, size, alignment, orientation, reflection, illumination change, blurriness and complex background. Among cursive scripts, Arabic scene text recognition is considered a more challenging problem due to joined writing, same character variations, a large number of ligatures, the number of baselines, etc. Surveys on Latin and Chinese script-based scene text recognition systems can be found, but the Arabic-like scene text recognition problem is yet to be addressed in detail. In this manuscript, a description is provided to highlight some of the latest techniques presented for text classification. The presented techniques following a deep learning architecture are equally suitable for the development of Arabic cursive scene text recognition systems. The issues pertaining to text localization and feature extraction are also presented. Moreover, this article emphasizes the importance of having a benchmark cursive scene text dataset. Based on the discussion, future directions are outlined, some of which may provide insight about cursive scene text to researchers.
TL;DR: This manuscript investigates a new model which is concentrated on fusion strategy for recognition of character in Arabic handwritten script and results indicate that fusion strategies bring results better than traditional methods.
Abstract: Arabic is one of the five most widely spoken languages in the world (422 million speakers). Accurate recognition of Arabic characters is challenging for two reasons: first, its major dissimilarities and minor similarities with other languages, and second, its right-to-left writing style. Hence, good-quality features must be extracted and better classifiers are required to develop optical character recognition for Arabic characters/fonts. This manuscript explores the automatic and appropriate selection of classifiers and feature sets for character recognition in Arabic handwritten script. After analyzing the problem, this manuscript investigates a new model that concentrates on a fusion strategy. To achieve this objective, various classifiers and features are fused. The suggested work addresses multi-font script recognition covering the fonts SH Roqa, Naskh, Farsi, and Igaza. For the experiments, different classifiers and features appropriate to script recognition for Arabic fonts were considered and implemented. The suggested work was evaluated on the AHCD and IFN/ENIT datasets. The experimental results indicate that fusion strategies yield better results than traditional methods.
TL;DR: The authors’ model is able to generate multiple fonts at once through font style-specifying mechanism and it can generate a new font at the same time if the authors combine the characteristics of existing fonts.
Abstract: Owing to the complex structure of Chinese characters and the huge number of Chinese characters, it is very challenging and time consuming for artists to design a new font of Chinese characters. Therefore, the generation of Chinese characters and the transformation of font styles have become research hotspots. At present, most of the models on Chinese character transformation cannot generate multiple fonts, and they are not doing well in faking fonts. In this article, the authors propose a novel method of Chinese character fonts transformation and generation based on generative adversarial networks. The authors’ model is able to generate multiple fonts at once through font style-specifying mechanism and it can generate a new font at the same time if the authors combine the characteristics of existing fonts.
TL;DR: In this paper, the applicability of convolutional neural networks (CNNs) for detecting the conformance of the fonts used with those corresponding to the government standards was studied.
Abstract: In this paper, we consider the problem of detecting counterfeit identity documents in images captured with smartphones. As a number of documents contain special fonts, we study the applicability of convolutional neural networks (CNNs) for detecting the conformance of the fonts used with those corresponding to the government standards. Here, we use multi-task learning to differentiate samples by both fonts and characters and compare the resulting classifier with its analogue trained for binary font classification. We train neural networks for authenticity estimation of the fonts used in machine-readable zones and ID numbers of the Russian national passport and test them on samples of individual characters acquired from 3238 images of the Russian national passport. Our results show that the usage of multi-task learning increases sensitivity and specificity of the classifier. Moreover, the resulting CNNs demonstrate high generalization ability as they correctly classify fonts which were not present in the training set. We conclude that the proposed method is sufficient for authentication of the fonts and can be used as a part of the forgery detection system for images acquired with a smartphone camera.
TL;DR: In this article, the authors present the results of the study of a political cartoon in Arabic and French languages, which is characterized by the use of semiotic codes, for example, colour and kinesics; it is accompanied by paragraphemic means, which are font variations that go beyond the usage of punctuation marks in the standard language, and topographic means, representing various flat layouts of text.
Abstract: The article presents the material and the results of the study of a political cartoon in the Arabic and French languages. The relevance of this work is due to the description of the precedence in a polycode text in a comparative aspect due to the lack of a sufficient number of scientific papers affecting this issue. The authors offer an overview of the main stages of the study of texts with iconic and verbal components and different types of the component links; and history of the appearance of the terms that nominate this type of text. The text in our research is characterized by the use of semiotic codes, for example, colour and kinesics; it is accompanied by paragraphemic means, which are font variations that go beyond the use of punctuation marks in the standard language, and topographic means, representing various flat layouts of text. The implementation of the described type of text becomes a political discourse, the study of which also relates to the actual research topics of modern linguistics. It should be noted that a political cartoon is always a reflection of the opinion of society or an individual’s reaction to a significant public event, and it is its universal feature, which makes it possible to compare the means and categories of a creolized text of political cartoon in different linguistic cultures/in different languages. The object of the analysis is the creolized text of a political cartoon; the subject of the research is the category of precedence and its features aimed at the realization of the author’s intention in the political cartoon in Arabic and French. An important systemic characteristic of a creolized text is the category of precedence, which in this article corresponds to the category of intertextuality. The authors give examples to examine the use of precedent information at the level of the text and the image and their interconnection. 
The thorough analysis of the cartoons demonstrates the possibility of decoding a precedent sign in accordance with the type of speech culture of a native speaker. Precedence in a political cartoon can be expressed by a textual or graphic representation of universal human precedent phenomena, civilizational precedent phenomena, onyms and events of a supraregional nature. The formal expression of precedence can also be a symbol. In the conclusion, the article proposes a summary of the results of the study.
TL;DR: The overall pattern of results across the study suggests that an alternative response set, regardless of whether it belonged to a co-actor or to a non-social no-go condition, evoked equal amounts of interference comparable to those of the own response set.
Abstract: People working together on a task must often represent the goals and salient items of their partner. The aim of the present study was to study the influence of joint task representations in an interference task in which the congruency relies on semantic identity. If task representations are shared between partners in a joint Stroop task (co-representation account), we hypothesized that items in the response set of one partner might influence performance of the other. In Experiment 1, pairs of participants sat side by side. Each participant was instructed to press one of two buttons to indicate which of two colors assigned to them was present, ignoring the text and responding only to the pixel color. There were three types of incongruent distractor words: names of colors from their own response set, names of colors from the other partner’s response set, and neutral words for colors not used as font colors. The results of Experiment 1 showed that when people were doing this task together, distractor words from the partner’s response set interfered more than neutral words and just as much as the words from their own response color set. However, in three follow-up experiments (Experiments 2a, 2b, and 2c), we found an elevated interference for the other response-set words even though no co-actor was present. The overall pattern of results across our study suggests that an alternative response set, regardless of whether it belonged to a co-actor or to a non-social no-go condition, evoked equal amounts of interference comparable to those of the own response set. Our findings are in line with a theory of common coding, in which all events—irrespective of their social nature—are represented and can influence behavior.
TL;DR: An affect-aware font and color palette selection methodology is presented that allows users to specify a desired affect and recommends congruent fonts and color palettes for the word cloud.
Abstract: Word clouds are widely used for non-analytic purposes, such as introducing a topic to students, or creating a gift with personally meaningful text. Surveys show that users prefer tools that yield word clouds with a stronger emotional impact. Fonts and color palettes are powerful paralinguistic signals that may determine this impact, but, typically, the expectation is that they are chosen by the users. We present an affect-aware font and color palette selection methodology that aims to facilitate more informed choices. We induce associations of fonts with a set of eight affects, and evaluate the resulting data in a series of user studies both on individual words as well as in word clouds. Relying on a recent study to procure affective color palettes, we carry out a similar user study to understand the impact of color choices on word clouds. Our findings suggest that both fonts and color palettes are powerful tools contributing to the affect associated with a word cloud. The experiments further confirm that the novel datasets we propose are successful in enabling this. Based on this data, we implement a prototype that allows users to specify a desired affect and recommends congruent fonts and color palettes for the word cloud.
TL;DR: The AIP Proceedings article template has many predefined paragraph styles for you to use/apply as you write your paper, but each paper must include an abstract.
Abstract: The AIP Proceedings article template has many predefined paragraph styles for you to use/apply as you write your paper. To format your abstract, use the Microsoft Word template style: Abstract. Each paper must include an abstract. Begin the abstract with the word “Abstract” followed by a period in bold font, and then continue with a normal 9 point font.
TL;DR: It is concluded that font readability can influence reading of easy and more difficult poems differentially, with strongest effects for easy poems.
Abstract: Previous research shows conflicting findings for the effect of font readability on comprehension and memory for language. It has been found that—perhaps counterintuitively–a hard to read font can be beneficial for language comprehension, especially for difficult language. Here we test how font readability influences the subjective experience of poetry reading. In three experiments we tested the influence of poem difficulty and font readability on the subjective experience of poems. We specifically predicted that font readability would have opposite effects on the subjective experience of easy versus difficult poems. Participants read poems which could be more or less difficult in terms of conceptual or structural aspects, and which were presented in a font that was either easy or more difficult to read. Participants read existing poems and subsequently rated their subjective experience (measured through four dependent variables: overall liking, perceived flow of the poem, perceived topic clarity, and perceived structure). In line with previous literature we observed a Poem Difficulty x Font Readability interaction effect for subjective measures of poetry reading. We found that participants rated easy poems as nicer when presented in an easy to read font, as compared to when presented in a hard to read font. Despite the presence of the interaction effect, we did not observe the predicted opposite effect for more difficult poems. We conclude that font readability can influence reading of easy and more difficult poems differentially, with strongest effects for easy poems.
TL;DR: It appears that readers rely on global typographic properties of the text in order to maintain an optimal number of characters to the left of their first fixation on a new line.
Abstract: Reading saccades that occur within a single line of text are guided by the size of letters. However, readers occasionally need to make longer saccades (known as return-sweeps) that take their eyes from the end of one line of text to the beginning of the next. In this study, we tested whether return-sweep saccades are also guided by font size information and whether this guidance depends on visual acuity of the return-sweep target area. To do this, we manipulated the font size of letters (0.29 vs 0.39 deg. per character) and the length of the first line of text (16 vs 26 deg.). The larger font resulted in return-sweeps that landed further to the right of the line start and in a reduction of under-sweeps compared to the smaller font. This suggests that font size information is used when programming return-sweeps. Return-sweeps in the longer line condition landed further to the right of the line start and the proportion of under-sweeps increased compared to the short line condition. This likely reflects an increase in saccadic undershoot error with the increase in intended saccade size. Critically, there was no interaction between font size and line length. This suggests that when programming return-sweeps, the use of font size information does not depend on visual acuity at the saccade target. Instead, it appears that readers rely on global typographic properties of the text in order to maintain an optimal number of characters to the left of their first fixation on a new line.
TL;DR: This research focused on the development of a new method of Latin-to-Balinese script transliteration that can be used for a web/mobile learning application to give transliteration knowledge as one aspect of Balinese script writing.
Abstract: As one element of Balinese cultural richness, Balinese script writing is at risk of extinction because of its decreasing use. This research is one way to preserve Balinese script writing using a technological approach. Through collaboration between the Software Engineering and Language disciplines, this research focused on the development of a new method of Latin-to-Balinese script transliteration that can be used for a web/mobile learning application to give transliteration knowledge as one aspect of Balinese script writing. The method utilized the Bali Simbar font and was developed based on seventeen identified kinds of special words and three stages of string pattern processing. The method used the Model-View-Controller (MVC) architectural pattern, with the components implemented respectively by a dictionary data structure (as a repository for words belonging to the seventeen kinds of special words), HTML (as the web/mobile application User Interface), and JavaScript (as the controller between Model and View, where the transliteration algorithm was written). The novelty of this method came from accommodating the knowledge belonging to the seventeen kinds of special words through a dictionary data structure look-up mechanism. In the experiment, this method passed over 98% (148 of 151) of the testing cases of The Balinese Alphabet writing rules and examples document. This result outperformed the best result of the known existing method based on the Bali Simbar font, i.e. Transliterasi Aksara Bali, which only has an accuracy of just over 68% (103 of 151) on the cases of the same testing document.
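The accuracy figures quoted in this abstract follow directly from the case counts it reports (148 and 103 passing cases out of 151). A small sketch of that arithmetic is given here; the function and variable names are illustrative only and do not come from the paper.

```python
def accuracy_percent(passed, total):
    """Share of passing test cases, expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * passed / total


TOTAL_CASES = 151  # testing cases reported in the abstract

proposed = accuracy_percent(148, TOTAL_CASES)  # new method
baseline = accuracy_percent(103, TOTAL_CASES)  # Transliterasi Aksara Bali

print(f"proposed: {proposed:.2f}%")                   # proposed: 98.01%
print(f"baseline: {baseline:.2f}%")                   # baseline: 68.21%
print(f"gap: {proposed - baseline:.2f} points")       # gap: 29.80 points
```

The gap corresponds to 45 additional test cases handled correctly on the same testing document.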
TL;DR: This multidisciplinary collaborative research produced the first known method utilizing the Noto Sans Balinese font, whose contributions include the establishment of the MVC design pattern and its implementation, the identification of 17 types of particular words, and an accuracy improvement over other methods that use the Bali Simbar font.
Abstract: This research aims to provide a new method of Latin-to-Balinese script transliteration based on the Noto Sans Balinese font, since no prior research has established the state of the art for this font. This multidisciplinary collaborative research produced the first known method utilizing the Noto Sans Balinese font, with the following contributions: 1) the establishment of the MVC design pattern and its implementation; 2) the identification of 17 types of particular words; 3) the accommodation of those particular words by a dictionary data structure; 4) the establishment of an accuracy measurement for the development of future methods; and 5) an accuracy improvement compared to other methods utilizing the Bali Simbar font. This method can serve as the core of a learning application that teaches transliteration as part of the Balinese Language learning program in Indonesia, especially in Bali Province, where Balinese Language is a mandatory subject from elementary to high school. The method uses the MVC architectural pattern, where the Model handles a repository for the particular words, the View handles the application user interface, and the Controller handles the transliteration algorithm based on string pattern matching. In the experiment, the method achieved an accuracy above 91% (138 of 151 cases) on the test cases of The Balinese Alphabet document. On the same test cases, this result outperformed the existing method Transliterasi Aksara Bali, based on the Bali Simbar font, whose accuracy is slightly above 68% (103 of 151 cases). In future work, this method could be enhanced by: 1) accommodating the remaining and future rules and examples, inside and outside the test document, that currently cannot be handled or that produce incorrect results; and 2) enriching the particular-words repository.
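The accuracy figures quoted in this abstract and the preceding one follow from the reported case counts. The short Python sketch below simply recomputes them; the function name `accuracy_percent` and the labels in `REPORTED` are illustrative choices, not names from the papers.

```python
def accuracy_percent(passed, total):
    """Percentage of test cases passed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * passed / total


# Case counts as reported in the two abstracts (151 test cases each).
REPORTED = {
    "Bali Simbar method": (148, 151),
    "Noto Sans Balinese method": (138, 151),
    "Transliterasi Aksara Bali": (103, 151),
}

for name, (passed, total) in REPORTED.items():
    print(f"{name}: {accuracy_percent(passed, total):.2f}%")
```

The results, 98.01%, 91.39% and 68.21%, are consistent with the "over 98%", "above 91%" and "a little above 68%" wording in the abstracts.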
TL;DR: In this article, a multi-task learning framework is employed to jointly improve font classification and remove negative side effects caused by intra-class variances of glyph content, which can be used to generate improved font classifications.
Abstract: The present disclosure relates to a font recognition system that employs a multi-task learning framework to jointly improve font classification and remove negative side effects caused by intra-class variances of glyph content. For example, in one or more embodiments, the font recognition system can jointly train a font recognition neural network using a font classification loss model and triplet loss model to generate a deep learning neural network that provides improved font classifications. In addition, the font recognition system can employ the trained font recognition neural network to efficiently recognize fonts within input images as well as provide other suggested fonts.
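One common way to combine a classification loss with a triplet loss is a weighted sum, sketched below in plain Python. This is an illustration under stated assumptions, not the system's actual formulation: the abstract does not say how the two losses are weighted or how triplets are formed. Here it is assumed that the anchor and positive are embeddings of the same font rendered with different glyph content, that the negative is an embedding of a different font, and that `weight` and `margin` are hypothetical hyperparameters.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on squared Euclidean distances between embeddings."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


def joint_loss(logits, label, anchor, positive, negative,
               weight=1.0, margin=1.0):
    """Classification loss plus a weighted triplet term (assumed form)."""
    return (cross_entropy(logits, label)
            + weight * triplet_loss(anchor, positive, negative, margin))
```

Under these assumptions, the triplet term pulls together embeddings that share a font regardless of the glyphs shown, which is one way to reduce the effect of intra-class variance in glyph content.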
TL;DR: A stroke recovery method based on deep convolutional neural network model for sequence recovery of static image strokes that performs robustly and competitively among multi-writer handwriting DOR tasks.
Abstract: Humans can recover the stroke order from static handwritten images. After training on a large amount of data, a machine may likewise learn patterns in the training data that let it imitate this skill. To address the problem of stroke sequence recovery from static images, this paper proposes a stroke recovery method based on a deep convolutional neural network model. In the model training phase, using the two-dimensional static handwritten image, the process of writing a font is converted into three channels, comprising the strokes that have been written, the possible positions of the next strokes, and the completed font, and the state of the input sample is quantified. In the recovery phase, the font to be restored is preprocessed to obtain its stroke segments, and the trained model is used to evaluate sequential combinations of different stroke segments so as to obtain the correct stroke order. With the writing experience of no more than one hundred characters, the proposed method performs robustly and competitively in multi-writer handwriting DOR tasks.
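The recovery phase, in which candidate orderings of stroke segments are scored and the best one is kept, can be sketched as an exhaustive search in Python. The scoring function `toy_score` is a hypothetical stand-in for the trained CNN, which is not available here: it merely prefers a first stroke near the top-left corner and later strokes that start close to where the previous one ended. The segment representation as a pair of `(x, y)` endpoints is also an assumption.

```python
from itertools import permutations


def toy_score(written, candidate):
    """Hypothetical stand-in for the trained model's score (higher is better)."""
    start = candidate[0]
    if not written:
        # Prefer starting near the top-left corner.
        return -(start[0] + start[1])
    last_end = written[-1][1]
    # Prefer a stroke that starts where the previous one ended.
    return -(abs(last_end[0] - start[0]) + abs(last_end[1] - start[1]))


def recover_order(segments, score=toy_score):
    """Return the index order whose step-by-step score total is highest."""
    best_order, best_total = (), float("-inf")
    for order in permutations(range(len(segments))):
        written, total = [], 0.0
        for idx in order:
            total += score(written, segments[idx])
            written.append(segments[idx])
        if total > best_total:
            best_order, best_total = order, total
    return list(best_order)


# Each segment is ((start_x, start_y), (end_x, end_y)).
segments = [((5, 5), (9, 9)), ((0, 0), (2, 2)), ((2, 2), (5, 5))]
print(recover_order(segments))
```

Exhaustive search grows factorially with the number of segments, so it is practical only for a handful of strokes; the abstract does not state which search strategy the paper uses.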
TL;DR: A novel model named FontGAN is proposed, which integrates the character stylization and de-stylization into a unified framework and facilitates more precise control of these two types of variables, thereby improving the quality of the generated results.
Abstract: Chinese character synthesis involves two related aspects, i.e., style maintenance and content consistency. Although some methods have achieved remarkable success in synthesizing a character with a specified style from a standard font, how to map characters to a specified style domain without losing their identifiability remains very challenging. In this paper, we propose a novel model named FontGAN, which integrates character stylization and de-stylization into a unified framework. In our model, we decouple character images into a style representation and a content representation, which facilitates more precise control of these two types of variables, thereby improving the quality of the generated results. We also introduce two modules, namely, the font consistency module (FCM) and the content prior module (CPM). FCM exploits a category-guided Kullback-Leibler loss to embed the style representation into different Gaussian distributions. It constrains the characters of the same font in the training set globally. In addition, it enables our model to obtain style variables through sampling in the testing phase. CPM provides a content prior for the model to guide the content encoding process and alleviates the problem of stroke deficiency during de-stylization. Extensive experimental results on character stylization and de-stylization have demonstrated the effectiveness of our method.
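A category-guided Kullback-Leibler loss can be sketched as the KL divergence between the encoder's Gaussian for one character image and a Gaussian prior assigned to that character's font. The Python sketch below is an assumption about the general form only: the abstract does not give the exact loss, and the unit-variance per-font prior, the `font_mean` parameter, and both function names are hypothetical.

```python
import math
import random


def category_kl(mu, log_var, font_mean):
    """KL( N(mu, diag(exp(log_var))) || N(font_mean, I) ), summed over dimensions."""
    total = 0.0
    for m, lv, c in zip(mu, log_var, font_mean):
        total += 0.5 * (math.exp(lv) + (m - c) ** 2 - 1.0 - lv)
    return total


def sample_style(font_mean, rng):
    """Draw a style vector from the assumed unit-variance prior of one font."""
    return [rng.gauss(c, 1.0) for c in font_mean]


# A style code that already matches its font prior incurs no penalty.
print(category_kl([0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
# Sampling a style at test time, without any reference image.
print(sample_style([0.0, 0.0], random.Random(0)))
```

Tying each font to its own Gaussian is what would make test-time sampling possible, which matches the abstract's remark that style variables can be obtained through sampling.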
TL;DR: This work presents an algorithmic system capable of generating glyphs from typographical skeletons, and explores the relation between legibility and coherence in an innovative type design approach where the generated typefaces can be applied in the most diverse media.
Abstract: It is through typography that order and form, visible and durable, is given to written communication. Typography has been accompanied by new technologies, which designers and typographers have adapted. The use of these technologies in the design process boosted the exploration of new approaches for type design. Designers began to explore generative processes in their design process in different types of projects, such as dynamic visual identities or generative typography. In this work, we present an algorithmic system capable of generating glyphs from typographical skeletons. This system, which is online at cdv.dei.uc.pt/2019/letterspecies, fills a typographical skeleton using a drawing technique, both selected by the user, and outputs an OpenType font. With the presented system, we explore the relation between legibility and coherence in an innovative type design approach where the generated typefaces can be applied in the most diverse media.
TL;DR: Two steganography approaches based on the Similarity of English Font Styles (SEFS) that embed a secret message in a text document by replacing the main font style with a similar one.
Abstract: Data protection has become a more critical issue, and the need to secure transmission channels has become more serious. Therefore, steganography, the art of hiding data in digital media by embedding a secret message in a cover document so that no one except the intended recipient suspects its existence, has become a relevant topic of research. The main challenge in steganography is achieving high robustness and capacity without damaging the imperceptibility of the cover document. This article presents two steganography approaches based on the Similarity of English Font Styles (SEFS). In this process, the main font style of the document is replaced by a similar font style to embed the secret message after encoding it. This is done by using 1) the upper-case letters and punctuation marks of the carrier document or 2) the white space between words and the start and end letters of each word that has more than 2 letters in the carrier document. These approaches were tested by applying them to various document formats with various font styles. According to the findings, the secret message was not apparent to an adversary, the stego-document size increased, and the capacity was very high. The approaches were also implemented in C# as a tool that hides critical data in text documents, and the same findings were achieved.
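The first approach, which uses upper-case letters and punctuation marks as carriers, can be sketched in Python as follows. This is a simplified illustration rather than the article's algorithm: it assumes one bit per carrier character, represents a styled document as a list of `(character, font)` pairs, and uses the hypothetical font names `FontA` and `FontB` in place of a real pair of visually similar fonts. The article's encoding step and its C# tool are not reproduced.

```python
import string


def is_carrier(ch):
    """Carrier characters: upper-case letters and ASCII punctuation marks."""
    return ch.isupper() or ch in string.punctuation


def embed(cover, bits, base_font="FontA", similar_font="FontB"):
    """Style each carrier with the similar font for bit 1, the base font for bit 0."""
    remaining = list(bits)
    styled = []
    for ch in cover:
        if is_carrier(ch) and remaining:
            bit = remaining.pop(0)
            styled.append((ch, similar_font if bit == "1" else base_font))
        else:
            styled.append((ch, base_font))
    if remaining:
        raise ValueError("cover text has too few carrier characters")
    return styled


def extract(styled, n_bits, similar_font="FontB"):
    """Read the first n_bits carrier characters back into a bit string."""
    bits = []
    for ch, font in styled:
        if is_carrier(ch) and len(bits) < n_bits:
            bits.append("1" if font == similar_font else "0")
    return "".join(bits)


stego = embed("Hello, World! OK.", "1011")
print(extract(stego, 4))
```

In this simplified scheme the capacity equals the number of carrier characters in the cover text, so the receiver must also know the message length, passed here as `n_bits`.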
TL;DR: This work proposes a method to design text by a font fusion algorithm with arbitrary existing fonts, so that users can change the font type freely by indicating a point on the font map.
Abstract: Comics consist of frames, drawn images, speech balloons, text, and so on. In this work, we focus on the difficulty of designing the text used for the narration and quotes of characters. In order to support creators in their text design, we propose a method to design text by a font fusion algorithm with arbitrary existing fonts. In this method, users can change the font type freely by indicating a point on the font map. We implement a prototype system and discuss its effectiveness.