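A Hands-On Tutorial: Building a Text Recognition Tool with MATLAB App Designer, One Worked Example

I. Introduction

When reading electronic documents, we often run into text that is embedded in images. Copying it into notes is awkward: you have to retype it while glancing back and forth, and end up a reluctant "keyboard warrior"...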


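With no better option, why not build our own wheel and develop a text recognition tool of our own? That is how we arrived at MATLAB App Designer.

Anyone who has used MATLAB knows that it offers two tools for building graphical user interfaces. The first is guide, commonly just called "GUI", which will be removed in a future release. The second is App Designer, commonly called "App", which is the officially recommended approach and the mainstream framework going forward.

Today we use a simple example to show how to build an image text recognition tool as an App.

There are two main ways to build one:

The App Designer tool: flexible, convenient, and simple; the modern approach.

Programmatic construction based on uifigure: flexible and easy to refactor, suited to complex, large graphical user interfaces; the "stone age" approach.

Here we take the programmatic route.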

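II. Preparation

1. The API interface

Text recognition relies on optical character recognition (OCR). Building that low-level wheel ourselves, with high recognition accuracy, would be exhausting.

Fortunately, mature tools already exist: Baidu AI Cloud, Alibaba Cloud, iFLYTEK, and others all provide APIs that we can simply borrow. This article uses the text recognition API from Baidu AI Cloud as its example.

After applying for the text recognition service (free of charge), you can see the API Key and Secret Key in the console. These two values are used to obtain an access_token, which is a required parameter for calling the API (marked with a red box in the original console screenshot).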


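The technical documentation for text recognition gives the request interface for General Text Recognition (Standard Edition):

HTTP method: POST

Request URL: https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic

URL parameter: name access_token; value: the access_token obtained from the API Key and Secret Key (see "Obtaining the Access Token" in the documentation)

Header: name Content-Type; value application/x-www-form-urlencoded

Request parameter: name image; value: the image data, Base64-encoded and then URL-encoded. The size after Base64 encoding and URL encoding must not exceed 4M, the shortest side must be at least 15 px, the longest side at most 4096 px, and the jpg/jpeg/png/bmp formats are supported.

Response parameter: name words_result; value: an array of recognition results

The HTTP request itself is covered in detail further on.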
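As a rough illustration of the interface described above, the sketch below assembles the request URL and checks an image against the documented limits before anything is sent. The token value and the helper names (isValidSize, isValidLength, maxEncodedBytes) are hypothetical, and reading "4M" as 4 x 1024^2 bytes is an assumption. No network call is made here.

```matlab
% Endpoint taken from the interface description above.
apiURL = 'https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic';

% Placeholder token (hypothetical value); a real one is obtained from the
% API Key and Secret Key.
accessToken = 'YOUR_ACCESS_TOKEN';
requestURL = [apiURL, '?access_token=', accessToken];

% Documented limits: shortest side >= 15 px, longest side <= 4096 px.
isValidSize = @(height, width) min(height, width) >= 15 && ...
                               max(height, width) <= 4096;

% Assumption: "4M" is read as 4 * 1024^2 bytes after encoding.
maxEncodedBytes = 4 * 1024^2;
isValidLength = @(encoded) numel(encoded) <= maxEncodedBytes;

okSmall  = isValidSize(10, 500);                   % shortest side below 15 px
okNormal = isValidSize(500, 500);                  % within both limits
okLength = isValidLength(repmat('A', 1, 1000));    % far below the size limit

disp(requestURL);
```

Checking locally like this avoids spending a request on an image the service would reject anyway.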

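2. Base64 encoding of images

Base64 is one of the most common encodings on the web for transmitting 8-bit byte data. Its alphabet has 64 characters: the lowercase letters a-z, the uppercase letters A-Z, the digits 0-9, and the symbols + and /; the equals sign = is used as a padding suffix. Any data can be converted into characters from this set, and that conversion is called Base64 encoding. Base64-encoded data is not human-readable and must be decoded before it can be read.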
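To make the conversion concrete, here is a small hand-rolled sketch that encodes the three bytes of 'Man' by regrouping their 24 bits into four 6-bit indices. It is for illustration only (the variable names are made up, and padding with = is not handled); real code should use a library function.

```matlab
% The 64-character Base64 alphabet: A-Z, a-z, 0-9, +, /
alphabet = ['A':'Z', 'a':'z', '0':'9', '+/'];

% Three input bytes give exactly four output characters, so no '=' padding.
bytes = double('Man');                 % 77 97 110

% Write each byte as 8 bits and join them into one 24-bit string.
bits = dec2bin(bytes, 8)';             % 8 x 3 char matrix, one column per byte
bits = bits(:)';                       % '010011010110000101101110'

% Split into 6-bit groups; each group indexes one alphabet character.
groups = reshape(bits, 6, [])';        % 4 x 6 char matrix, one row per group
idx = bin2dec(groups);                 % 19 22 5 46
encoded = alphabet(idx(:)' + 1);       % +1 because MATLAB indexing starts at 1

disp(encoded);                         % prints TWFu
```

Every 3 input bytes become 4 output characters, which is why Base64 text is about one third larger than the raw data.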

Many programming languages ship ready-made Base64 library functions, and MATLAB is no exception; run help matlab.net.base64encode for details.

Three MATLAB implementations are given below.

The Java class org.apache.commons.codec.binary.Base64, and matlab.net.base64encode

    function base64string = img2base64(fileName)
    %IMG2BASE64 Encode an image file as a Base64 string
    %   INPUTS:
    %       fileName        string, an image file name
    %   OUTPUTS:
    %       base64string    string, the Base64 code of the input image
    %   USAGE:
    %       >>base64string = img2base64('1.jpg')
    %       >>base64string = 'xxx'
    %
    try
        % Read the raw bytes of the file
        fid = fopen(fileName, 'rb');
        bytes = fread(fid);
        fclose(fid);
        % -------------------------------------------------
        % First method: Java class
        % -------------------------------------------------
        encoder = org.apache.commons.codec.binary.Base64;
        base64string = char(encoder.encode(bytes))';
        % -------------------------------------------------
        % Second method: MATLAB built-in
        % -------------------------------------------------
        % base64string = matlab.net.base64encode(bytes);
    catch
        % fopen returns -1 for a missing file, so fread throws and we land here
        disp('The file does not exist!');
        base64string = '';
    end % end try
    end % end function

The Python base64 module

MATLAB can call Python directly, so Python's base64 module can be used as is. The source code:

    function base64string = img2base64_(fileName)
    %IMG2BASE64_ Encode an image file as a Base64 string via Python
    %   INPUTS:
    %       fileName        string, an image file name
    %   OUTPUTS:
    %       base64string    string, the Base64 code of the input image
    %   USAGE:
    %       >>base64string = img2base64_('1.jpg')
    %       >>base64string = 'xxx'
    %
    try
        % Read the file with Python's built-in open
        f = py.open(fileName, 'rb');
        bytes = f.read();
        f.close();
        temp = char(py.base64.b64encode(bytes));
        % Strip the b'...' wrapper of the Python bytes representation
        temp = regexp(temp, '(?<=b'').+(?='')', 'match');
        base64string = temp{1};
    catch
        disp('The file does not exist!');
        base64string = '';
    end % end try
    end % end function

We can Base64-encode the same image (500 x 500, shown in the original article) with each method and compare the encoding speed.


Result:

'/9j/4AAQSkZ...AAAAAAD/9k='

Java class org.apache.commons.codec.binary.Base64: 0.000783 s

matlab.net.base64encode: 0.017589 s

Python base64 module: 0.000709 s

The Java class and the Python base64 module run at about the same speed, while matlab.net.base64encode is more than 20 times slower. Even so, encoding a 500 x 500 image takes only about 0.02 s, which is still very fast.

Overall, we recommend the org.apache.commons.codec.binary.Base64 class for Base64 encoding.
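Single timings at this scale are noisy, so one way to reproduce such a comparison is to repeat each call and take the median. The sketch below does this for matlab.net.base64encode only, on a synthetic byte vector rather than a real image. The payload size, the run count, and the variable names are assumptions, and the numbers it prints will differ from those reported above.

```matlab
% Synthetic payload standing in for an image file (assumed size: 100 kB).
bytes = uint8(randi([0, 255], 1, 100000));

% Encoder under test; swap in another function handle to compare methods.
encodeFcn = @(b) matlab.net.base64encode(b);

nRuns = 20;                       % assumed number of repetitions
elapsed = zeros(1, nRuns);
for k = 1:nRuns
    t0 = tic;
    out = encodeFcn(bytes);
    elapsed(k) = toc(t0);
end

% The median is less sensitive to one-off delays than a single measurement.
fprintf('median: %.6f s over %d runs\n', median(elapsed), nRuns);
```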

3. Screen capture

To recognize text in scanned PDF documents, video tutorials, and the like, we take a screenshot of the region containing the text and save it as an image for recognition. The first step is capturing the screen, which MATLAB does with the Java class java.awt.Robot. The capture code:

    function imgData = screenSnipping
    %screenSnipping Capture the full screen to an image
    %   Output:
    %       imgData, uint8, image data.
    %   Source code from: https://www.mathworks.com/support/search.html/answers/362358-how-do-i-take-a-screenshot-using-matlab.html?fq=asset_type_name:answer%20category:matlab/audio-and-video&page=1
    %   Modified: Qingpinwangzi
    %   Date: Apr 14, 2021.

    % Take screen capture
    robo = java.awt.Robot;
    tk = java.awt.Toolkit.getDefaultToolkit();
    rectSize = java.awt.Rectangle(tk.getScreenSize());
    cap = robo.createScreenCapture(rectSize);
    % Convert to an RGB image: each packed pixel is split into 4 bytes
    rgb = typecast(cap.getRGB(0, 0, cap.getWidth, cap.getHeight, [], 0, cap.getWidth), 'uint8');
    imgData = zeros(cap.getHeight, cap.getWidth, 3, 'uint8');
    imgData(:, :, 1) = reshape(rgb(3:4:end), cap.getWidth, [])';   % red
    imgData(:, :, 2) = reshape(rgb(2:4:end), cap.getWidth, [])';   % green
    imgData(:, :, 3) = reshape(rgb(1:4:end), cap.getWidth, [])';   % blue
    end
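The channel offsets in screenSnipping follow from how a packed pixel is split into bytes. A small sketch, assuming a little-endian machine (true of common x86 and ARM systems) and a made-up pixel value: getRGB packs each pixel as 0xAARRGGBB, and typecast exposes the bytes lowest first, so blue lands at offset 1, green at 2, red at 3, and alpha at 4.

```matlab
% One packed ARGB pixel: alpha = 0xFF, red = 0x11, green = 0x22, blue = 0x33.
% (Hypothetical sample value, not taken from a real screenshot.)
pixel = uint32(hex2dec('FF112233'));

% Reinterpret the 4-byte integer as four separate bytes.
% Assumption: little-endian byte order, so the lowest byte comes first.
bytes = typecast(pixel, 'uint8');

blue  = bytes(1);   % 0x33 = 51
green = bytes(2);   % 0x22 = 34
red   = bytes(3);   % 0x11 = 17
alpha = bytes(4);   % 0xFF = 255

fprintf('R=%d G=%d B=%d A=%d\n', red, green, blue, alpha);
```

This is why the capture code reads red from rgb(3:4:end), green from rgb(2:4:end), and blue from rgb(1:4:end), and ignores every fourth byte.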

4. Calling the Baidu API to recognize text

As noted in subsection 1, access_token is a required parameter for calling the API. According to the technical documentation, it is obtained with an HTTP request that carries the API Key and Secret Key. The core code:

    options = weboptions('RequestMethod', 'post');
    url = ['https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=', apiKey, '&client_secret=', secretKey];
    res = webread(url, options);
    access_token = res.access_token;

With the access_token in hand we can call the text recognition API. Here is the recognition source code:

    function result = getWordsByBaiduOCR(fileName, apiKey, secretKey, accessToken, apiURL, outType)
    %GETWORDSBYBAIDUOCR Return the words recognized in an image
    %   INPUTS:
    %       fileName        string, an image file name
    %       apiKey          string, the API Key of the application
    %       secretKey       string, the Secret Key of the application
    %       accessToken     string, default is ''; if empty, the Access Token
    %                       is requested with the API Key and Secret Key.
    %       apiURL          string, such as:
    %                       'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate'
    %                       'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic'
    %                       'https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic'
    %       outType         'MultiLine' (default) | 'SingleLine'
    %   OUTPUTS:
    %       result          cell (MultiLine) | char (SingleLine) | '' on failure
    %   USAGE:
    %       >>result = getWordsByBaiduOCR(fileName, apiKey, secretKey, accessToken, apiURL)
    %   Date: Mar 18, 2021.
    %   Author: 清貧王子
    %
    options = weboptions('RequestMethod', 'post');
    % outType is optional: use the default when it is omitted or empty
    if nargin < 6 || isempty(outType)
        outType = 'MultiLine';
    end
    % Request an access token unless the caller supplied one
    if isempty(accessToken)
        url = ['https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=', apiKey, '&client_secret=', secretKey];
        res = webread(url, options);
        access_token = res.access_token;
    else
        access_token = accessToken;
    end % end if
    url = [apiURL, '?access_token=', access_token];
    options.HeaderFields = {'Content-Type', 'application/x-www-form-urlencoded'};
    imgBase64String = img2base64(fileName);
    if isempty(imgBase64String)
        result = '';
        return
    end % end if
    % webwrite form-encodes the name-value pair in the request body
    res = webwrite(url, 'image', imgBase64String, options);
    wordsResult = res.words_result;
    result = '';
    if strcmp(outType, 'SingleLine')
        % Concatenate all recognized lines into one char vector
        for ii = 1 : size(wordsResult, 1)
            result = [result, wordsResult(ii, 1).words]; %#ok<AGROW>
        end % end for
    elseif strcmp(outType, 'MultiLine')
        % One cell per recognized line
        result = cell(1, size(wordsResult, 1));
        for ii = 1 : size(wordsResult, 1)
            result{ii} = wordsResult(ii, 1).words;
        end % end for
    end
    end % end function

Let us test the function on an image (captured from https://ww2.mathworks.cn/products/matlab/app-designer.html) and recognize the text in it.


    >> result =
      1×7 cell array
      Columns 1 through 4
        {'App設計工具幫助您…'}    {'開發專業背景。您只…'}    {'面(GUI)設計佈局,…'}    {'程式設計。'}
      Columns 5 through 7
        {'要共享App,您可以使…'}    {' MATLAB Compile…'}    {'桌面App或 Web App'}
    >> result{1}
    ans =
        'App設計工具幫助您建立專業的App,同時並不要求軟體'

The result contains 7 cells, meaning that 7 lines of text were recognized in the image: each cell holds one recognized line, as result{1} shows.
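To make the shape of the response concrete, the sketch below decodes a hand-written JSON string (made-up sample data, not a real API reply) that contains only the words_result field described earlier, and then produces both output styles. The variable names are illustrative.

```matlab
% Hand-written sample with the words_result layout (hypothetical data).
sampleJSON = '{"words_result":[{"words":"first line"},{"words":"second line"}]}';

% jsondecode turns the array of objects into a struct array with one
% element per recognized line, which is why the recognition function
% reads the lines element by element through the words field.
res = jsondecode(sampleJSON);
wordsResult = res.words_result;

% MultiLine style: one cell per recognized line.
multiLine = {wordsResult.words};

% SingleLine style: all lines concatenated into one char vector.
singleLine = [multiLine{:}];

fprintf('%d lines recognized\n', numel(multiLine));
disp(singleLine);
```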

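III. Building the tool

For a uifigure-based programmatic App we recommend an object-oriented (OOP) approach. To keep things simple, we wrap everything in a single class. The more standard practice, of course, is to separate interface and logic with a design pattern such as MVC, which supports the open-closed principle: open for extension, closed for modification.

1. Functional requirements

Our requirements are very simple. There are two main functions:

Recognize text in an existing image

Recognize text in scanned PDF documents, video tutorials, and the like

For the first function we only need to load the image, call the recognition function, and show the result in a text area. For the second we first take a screenshot, select the region containing the text, and save it as an image; the rest is the same as for the first function.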
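The region-selection step is not listed in this article (the screenshot callback calls a cropImg function whose source is not shown). As a rough idea of what cropping involves, the sketch below cuts a rectangle out of an image array by indexing. The synthetic image, the rectangle values, and the variable names are all hypothetical stand-ins; in the real tool the array would come from the screen capture and the rectangle from the user's mouse selection.

```matlab
% Synthetic 100 x 200 RGB "screenshot" (stand-in for real capture data).
imgData = zeros(100, 200, 3, 'uint8');
imgData(21:60, 31:110, :) = 255;    % a white patch playing the role of text

% Selection rectangle as [x, y, width, height] in pixels (hypothetical).
rect = [31, 21, 80, 40];

% Rows correspond to y, columns to x.
rows = rect(2) : rect(2) + rect(4) - 1;
cols = rect(1) : rect(1) + rect(3) - 1;
cropped = imgData(rows, cols, :);

fprintf('Cropped size: %d x %d\n', size(cropped, 1), size(cropped, 2));
% The cropped array could then be written to disk with imwrite.
```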

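From this description, the controls we need are: a load-image button, a screenshot button, an image display, and a text area for the recognition result. We also need a clean button to clear the displayed image and result, and a setup button to configure the API Key and Secret Key.

For ease of explanation, here is the final design first (two screenshots in the original article):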

[Figure: main window of the text recognition tool]

[Figure: settings window]

The settings window needs two labels, two text fields, and two buttons. With that, all required controls are clear, so let us create them.

2. Implementation details

We wrap the functionality in one class, which we name ReadWords. It must inherit from matlab.apps.AppBase, and its properties are all the controls in the interface. The class should look like this:

    classdef ReadWords < matlab.apps.AppBase
        %% Public properties: the controls of the interface
        properties
            UIFig               matlab.ui.Figure
            ContainerForMain    matlab.ui.container.GridLayout
            ThisTB              matlab.ui.container.Toolbar
            SnippingToolBtn     matlab.ui.container.toolbar.PushTool
            ImgLoadToolBtn      matlab.ui.container.toolbar.PushTool
            SetupToolBtn        matlab.ui.container.toolbar.PushTool
            CleanToolBtn        matlab.ui.container.toolbar.PushTool
            ImgShow             matlab.ui.control.Image
            WordsShowTA         matlab.ui.control.TextArea
            ContainerForSetup   matlab.ui.container.GridLayout
            APIKeyText          matlab.ui.control.EditField
            SecrectKeyText      matlab.ui.control.EditField
            ResetBtn            matlab.ui.control.Button
            SaveBtn             matlab.ui.control.Button
        end % end properties
        %% Dependent properties: computed from the edit fields
        properties(Hidden, Dependent)
            APIKeyVal
            SecrectKeyVal
        end % end properties
        %% Internal state
        properties(Access = protected)
            HasSetup = false
        end % end properties
    end % end classdef

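Some of the important properties are explained below.

Public properties:

UIFig: must be of class matlab.ui.Figure, constructed with uifigure; this is the main window of the whole tool.

ContainerForMain: must be of class matlab.ui.container.GridLayout, constructed with uigridlayout; this is the layout container of the main window.

ThisTB: must be of class matlab.ui.container.Toolbar, constructed with uitoolbar; this is the toolbar container that holds the four tool buttons SnippingToolBtn, ImgLoadToolBtn, SetupToolBtn, and CleanToolBtn.

ImgShow: must be of class matlab.ui.control.Image, constructed with uiimage; it displays the loaded or captured image.

WordsShowTA: must be of class matlab.ui.control.TextArea, constructed with uitextarea; it displays the text recognition result.

ContainerForSetup: the grid container of the settings window.

APIKeyText, SecrectKeyText: the fields for entering the API Key and Secret Key.

ResetBtn, SaveBtn: the two buttons that reset and save the API Key and Secret Key, respectively.

Dependent, hidden properties:

APIKeyVal: receives the API Key value entered in APIKeyText.

SecrectKeyVal: receives the Secret Key value entered in SecrectKeyText.

Protected property:

HasSetup: flags whether the API Key and Secret Key have been configured; defaults to false.

With all properties in place, we move on to the constructor, the destructor, and the class methods.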

With the constructor, the destructor, and the get methods of the dependent properties APIKeyVal and SecrectKeyVal added, the class looks like this:

    classdef ReadWords < matlab.apps.AppBase
        %% Public properties: the controls of the interface
        properties
            UIFig               matlab.ui.Figure
            ContainerForMain    matlab.ui.container.GridLayout
            ThisTB              matlab.ui.container.Toolbar
            SnippingToolBtn     matlab.ui.container.toolbar.PushTool
            ImgLoadToolBtn      matlab.ui.container.toolbar.PushTool
            SetupToolBtn        matlab.ui.container.toolbar.PushTool
            CleanToolBtn        matlab.ui.container.toolbar.PushTool
            ImgShow             matlab.ui.control.Image
            WordsShowTA         matlab.ui.control.TextArea
            ContainerForSetup   matlab.ui.container.GridLayout
            APIKeyText          matlab.ui.control.EditField
            SecrectKeyText      matlab.ui.control.EditField
            ResetBtn            matlab.ui.control.Button
            SaveBtn             matlab.ui.control.Button
        end % end properties
        %% Dependent properties: computed from the edit fields
        properties(Hidden, Dependent)
            APIKeyVal
            SecrectKeyVal
        end % end properties
        %% Internal state
        properties(Access = protected)
            HasSetup = false
        end % end properties
        %%
        methods
            % ------------------------------------------------------------
            % Constructor
            % ------------------------------------------------------------
            function app = ReadWords
                % Create UIFigure and components
                app.buildApp();
                % Register the app with App Designer
                registerApp(app, app.UIFig)
                if nargout == 0
                    clear app
                end
            end % end Constructor
            % ------------------------------------------------------------
            % Destructor
            % ------------------------------------------------------------
            % Code that executes before app deletion
            function delete(app)
                % Delete UIFigure when app is deleted
                delete(app.UIFig)
            end % end Destructor
            % ------------------------------------------------------------
            % Get/Set methods
            % ------------------------------------------------------------
            % get.APIKeyVal
            function apiKeyVal = get.APIKeyVal(app)
                apiKeyVal = app.APIKeyText.Value;
            end
            % get.SecrectKeyVal
            function secrectKeyVal = get.SecrectKeyVal(app)
                secrectKeyVal = app.SecrectKeyText.Value;
            end
        end % end methods
    end % end classdef

The destructor is written the same way every time, and so is registerApp(app, app.UIFig) in the constructor. The buildApp() method creates the interface and sets up the controls.

We make all subsequent methods private. After adding buildApp(), the ReadWords class gains the private methods block below, placed after the public methods block and before the final end of the classdef. The properties and public methods are unchanged from the previous listing and are not repeated here:

    %%
    methods(Access = private)
        % buildApp
        function buildApp(app)
            % ------------------------------------------------------------
            % Main Figure
            % ------------------------------------------------------------
            app.UIFig = uifigure();
            app.UIFig.Icon = 'icons/img2text.png';
            app.UIFig.Name = 'ReadWords';
            app.UIFig.Visible = 'off';
            app.UIFig.Position = [app.UIFig.Position(1), app.UIFig.Position(2), 745, 420];
            app.UIFig.AutoResizeChildren = 'on';
            app.UIFig.Units = 'Normalized';
            app.setAutoResize(app.UIFig, true);
            % ------------------------------------------------------------
            % Toolbar
            % ------------------------------------------------------------
            app.ThisTB = uitoolbar(app.UIFig);
            % SetupToolBtn
            app.SetupToolBtn = uipushtool(app.ThisTB);
            app.SetupToolBtn.Icon = 'icons/setup.png';
            app.SetupToolBtn.Tooltip = 'Setup';
            % SnippingToolBtn
            app.SnippingToolBtn = uipushtool(app.ThisTB);
            app.SnippingToolBtn.Icon = 'icons/snip.png';
            app.SnippingToolBtn.Tooltip = 'Screenshot';
            % ImgLoadToolBtn
            app.ImgLoadToolBtn = uipushtool(app.ThisTB);
            app.ImgLoadToolBtn.Icon = 'icons/load.png';
            app.ImgLoadToolBtn.Tooltip = 'Load image';
            % CleanToolBtn
            app.CleanToolBtn = uipushtool(app.ThisTB);
            app.CleanToolBtn.Icon = 'icons/clean.png';
            app.CleanToolBtn.Tooltip = 'Clean';
            % ------------------------------------------------------------
            % ContainerForMain
            % ------------------------------------------------------------
            app.ContainerForMain = uigridlayout(app.UIFig, [1, 2]);
            % Two side-by-side panels: original image and recognition result
            imgShowPanel = uipanel(app.ContainerForMain, 'Title', 'Original');
            resultShowPanel = uipanel(app.ContainerForMain, 'Title', 'Result');
            % ImgShow
            imgShowPanelLay = uigridlayout(imgShowPanel, [1, 1]);
            imgShowPanelLay.RowSpacing = 0;
            imgShowPanelLay.ColumnSpacing = 0;
            app.ImgShow = uiimage(imgShowPanelLay);
            % WordsShowTA
            resultShowPanelLay = uigridlayout(resultShowPanel, [1, 1]);
            resultShowPanelLay.RowSpacing = 0;
            resultShowPanelLay.ColumnSpacing = 0;
            app.WordsShowTA = uitextarea(resultShowPanelLay);
            app.WordsShowTA.FontSize = 22;
            % ------------------------------------------------------------
            % ContainerForSetup (hidden until the setup button is clicked)
            % ------------------------------------------------------------
            app.ContainerForSetup = uigridlayout(app.UIFig, [4, 3]);
            app.ContainerForSetup.RowHeight = {22, 22, 22, '1x'};
            app.ContainerForSetup.ColumnWidth = {'1x', '1x', '2.5x'};
            app.ContainerForSetup.Visible = 'off';
            apiKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'API Key');
            apiKeyLabel.HorizontalAlignment = 'right';
            apiKeyLabel.Layout.Row = 1;
            apiKeyLabel.Layout.Column = 1;
            % APIKeyText
            app.APIKeyText = uieditfield(app.ContainerForSetup);
            app.APIKeyText.Layout.Row = 1;
            app.APIKeyText.Layout.Column = 2;
            secrectKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'Secret Key');
            secrectKeyLabel.HorizontalAlignment = 'right';
            secrectKeyLabel.Layout.Row = 2;
            secrectKeyLabel.Layout.Column = 1;
            % SecrectKeyText
            app.SecrectKeyText = uieditfield(app.ContainerForSetup);
            app.SecrectKeyText.Layout.Row = 2;
            app.SecrectKeyText.Layout.Column = 2;
            % ResetBtn
            app.ResetBtn = uibutton(app.ContainerForSetup, 'Text', 'Reset');
            app.ResetBtn.Layout.Row = 3;
            app.ResetBtn.Layout.Column = 1;
            % SaveBtn
            app.SaveBtn = uibutton(app.ContainerForSetup, 'Text', 'Save');
            app.SaveBtn.Layout.Row = 3;
            app.SaveBtn.Layout.Column = 2;
            % Set visibility for UIFig
            movegui(app.UIFig, 'center');
            app.UIFig.Visible = 'on';
            % ------------------------------------------------------------
            % Run startupFcn
            % ------------------------------------------------------------
            app.runStartupFcn(@startupFcn);
        end % end buildApp
    end % end methods

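Note that the toolbar and window icons come from https://www.easyicon.cc/, where many common icon assets can be downloaded for free. We have already downloaded the icons; if you need them, use the link below:

Link: https://pan.baidu.com/s/11kIvt4SX-MhQ2ltEeC18ZA Extraction code: 5i3k

Also, the statement app.runStartupFcn(@startupFcn); calls a method of the parent class matlab.apps.AppBase, and we put the registration of the controls' callbacks into the startupFcn method. For now, comment out that statement and run ReadWords.m directly; this displays the interface just constructed in buildApp (the original article shows this as an animation).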


As you can see, clicking the toolbar buttons does nothing yet, because no callbacks have been registered for the controls. We do that next in the startupFcn method, which is added to the private methods block right after buildApp. The rest of the class is unchanged and is not repeated here:

    % startupFcn
    function startupFcn(app, ~, ~)
        % Setup APIKeyText and SecrectKeyText from a previously saved file
        if exist('apikey.mat', 'file')
            temp = load('apikey.mat');
            app.APIKeyText.Value = temp.key.apiKeyVal;
            app.APIKeyText.Editable = 'off';
            app.SecrectKeyText.Value = temp.key.secrectKeyVal;
            app.SecrectKeyText.Editable = 'off';
        end
        % Register callbacks
        app.SnippingToolBtn.ClickedCallback = @app.clickedSnippingToolBtn;
        app.ImgLoadToolBtn.ClickedCallback = @app.clickedImgLoadToolBtn;
        app.SetupToolBtn.ClickedCallback = @app.clickedSetupToolBtn;
        app.CleanToolBtn.ClickedCallback = @app.clickedCleanToolBtn;
        app.ResetBtn.ButtonPushedFcn = @app.callbackResetBtn;
        app.SaveBtn.ButtonPushedFcn = @app.callbackSaveBtn;
    end % end function

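In total we have registered 6 callback methods for 6 buttons. All of them must be implemented; otherwise a button will not respond when triggered. For brevity, we use callbackSaveBtn, the callback of SaveBtn in the settings window, as the example.

Before the API Key or Secret Key is set, triggering SnippingToolBtn or ImgLoadToolBtn brings up an alert asking you to set them up first.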


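Logic of callbackSaveBtn: the callback first checks whether both the API Key and the Secret Key have been entered. If either is empty, it warns that the API Key or Secret Key is missing; you then enter both values and click Save. The values are written to a .mat file, so they do not need to be entered again on later runs, and the two fields are locked against editing. To change the values, click Reset and configure them again. (The HasSetup property, false by default, is meant to mark whether the keys have been configured; in the callback source it is flipped by the setup toolbar button, which switches between the main view and the settings view.)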
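The persistence step can be tried in isolation. The sketch below stores a key struct in a .mat file and reads it back, using the same field names as the tool. The key values are made up, and a temporary file is used here instead of apikey.mat so that nothing in the working folder is touched.

```matlab
% Hypothetical key values; real ones come from the Baidu console.
key.apiKeyVal = 'my-api-key';
key.secrectKeyVal = 'my-secret-key';

% Write to a temporary file (assumption: the tool itself uses 'apikey.mat'
% in the current folder).
keyFile = [tempname, '.mat'];
save(keyFile, 'key');

% Later, e.g. at startup: load returns a struct whose field is the saved
% variable name, hence the temp.key.<field> access pattern.
temp = load(keyFile);
restoredApiKey = temp.key.apiKeyVal;
restoredSecretKey = temp.key.secrectKeyVal;

delete(keyFile);    % clean up the temporary file
disp(restoredApiKey);
```

Note that a .mat file stores the keys unencrypted; hiding the file, as the tool does on Windows, only keeps it out of sight.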

The code:

    % ------------------------------------------------------------
    % Callback functions
    % ------------------------------------------------------------
    % callbackSaveBtn
    function callbackSaveBtn(app, ~, ~)
        if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
            key.apiKeyVal = app.APIKeyText.Value;
            key.secrectKeyVal = app.SecrectKeyText.Value;
            % Remove any previous key file before saving a new one
            if exist('apikey.mat', 'file')
                delete('apikey.mat');
            end
            save('apikey.mat', 'key');
            % Windows only: mark the file as system + hidden
            !attrib +s +h apikey.mat
            uialert(app.UIFig, 'Saved successfully!', 'Confirm', 'Icon', 'success');
            % Lock the fields until Reset is clicked
            app.APIKeyText.Editable = 'off';
            app.SecrectKeyText.Editable = 'off';
        else
            uialert(app.UIFig, 'API Key or Secret Key is empty!', 'Confirm', 'Icon', 'warning');
        end % end if
    end % callbackSaveBtn

With the Save button implemented, the settings workflow is functional (the original article demonstrates it with an animation).


Source code of the other callbacks:

    % clickedSnippingToolBtn
    function clickedSnippingToolBtn(app, ~, ~)
        if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
            % Hide the tool window so it does not end up in the screenshot
            app.UIFig.Visible = 'off';
            pause(0.1);
            outFileName = 'temp.png';
            % cropImg (source not listed in this article) captures the
            % selected region and writes it to outFileName
            cropImg(outFileName);
            % Windows only: mark the file as system + hidden
            !attrib +s +h temp.png
            %
            app.ImgShow.ImageSource = imread(outFileName);
            app.UIFig.Visible = 'on';
            %
            apiURL = 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic';
            words = getWordsByBaiduOCR(outFileName, app.APIKeyVal, app.SecrectKeyVal, '', apiURL, 'MultiLine');
            app.WordsShowTA.Value = words;
        else
            msg = {'API Key or Secret Key is empty!'; 'Please set it up first!'};
            uialert(app.UIFig, msg, 'Confirm', 'Icon', 'warning');
        end
    end % end clickedSnippingToolBtn

    % clickedImgLoadToolBtn
    function clickedImgLoadToolBtn(app, ~, ~)
        if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
            [fName, fPath] = uigetfile({'*.png'; '*.jpg'; '*.bmp'; '*.tif'}, 'Open image');
            % uigetfile returns 0 for both outputs when the dialog is cancelled
            if ~isequal(any([fName, fPath]), 0)
                img = imread(strcat(fPath, fName));
                % Re-save as PNG so every input format is sent the same way
                outFileName = 'temp.png';
                if exist(outFileName, 'file')
                    delete(outFileName)
                end
                imwrite(img, outFileName);
                % Windows only: mark the file as system + hidden
                !attrib +s +h temp.png
                %
                app.ImgShow.ImageSource = imread(outFileName);
                app.UIFig.Visible = 'on';
                %
                apiURL = 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic';
                words = getWordsByBaiduOCR(outFileName, app.APIKeyVal, app.SecrectKeyVal, '', apiURL, 'MultiLine');
                app.WordsShowTA.Value = words;
            else
                return
            end % end if
        else
            msg = {'API Key or Secret Key is empty!'; 'Please set it up first!'};
            uialert(app.UIFig, msg, 'Confirm', 'Icon', 'warning');
        end % end if
    end % end clickedImgLoadToolBtn

    % clickedSetupToolBtn
    function clickedSetupToolBtn(app, ~, ~)
        % Toggle between the main view and the settings view
        if ~app.HasSetup
            app.ContainerForMain.Visible = 'off';
            app.ContainerForSetup.Visible = 'on';
            app.HasSetup = true;
        else
            app.ContainerForMain.Visible = 'on';
            app.ContainerForSetup.Visible = 'off';
            app.HasSetup = false;
        end
    end % end clickedSetupToolBtn

    % clickedCleanToolBtn
    function clickedCleanToolBtn(app, ~, ~)
        app.WordsShowTA.Value = '';
        app.ImgShow.ImageSource = '';
    end % end clickedCleanToolBtn

    % callbackResetBtn
    function callbackResetBtn(app, ~, ~)
        app.APIKeyText.Value = '';
        app.APIKeyText.Editable = 'on';
        app.SecrectKeyText.Value = '';
        app.SecrectKeyText.Editable = 'on';
    end % callbackResetBtn

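IV. Demo

Now let us test the text recognition tool. Suppose a graduate student, call him Mazi, is reading a scanned PDF paper and wants to copy a passage for his notes or for slides. This is exactly where our tool comes in.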


In an instant, Mazi has the result he wanted and breaks into a long-missed happy smile!


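V. Conclusion

That completes a fairly full-featured text recognition tool! We hope you like it and find something useful in it.

For the complete code of this article, please leave a comment.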
