An Analysis of Google ML Kit
1. Introduction to ML Kit
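At Google I/O 2018, Google announced ML Kit. With ML Kit, even developers without a machine learning background can quickly build features based on machine learning. Calling the ML Kit API is as simple as calling any other native mobile SDK; simple text recognition, for example, takes only a few lines of code.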
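ML Kit is a mobile SDK that brings Google's machine learning expertise to Android and iOS apps in a powerful yet easy-to-use package. Whether you are new to machine learning or already experienced, you can implement the functionality you need in just a few lines of code. You do not need deep knowledge of neural networks or model optimization to get started. If you are an experienced machine learning developer, on the other hand, ML Kit offers convenient APIs that help you use custom TensorFlow Lite models in your mobile apps.
[Figure: features currently provided by ML Kit]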
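2. ML Kit Architecture
First, note that ML Kit is part of Firebase, a platform for building mobile apps that provides real-time data storage and synchronization, user authentication, and more.
Firebase was founded in 2011. Before Google acquired it, Firebase was a product that helped developers build apps quickly by offering a dedicated mobile development platform and SDK; put simply, it was an integrated set of backend services. Google acquired Firebase in 2014, and at Google I/O 2016 it announced a new version of Firebase that integrated Google's existing cloud services and tools and broadened its feature set to cover three stages: develop, grow, and earn. It also integrated an analytics tool designed specifically for apps, centered on event and user analysis. The backend services for the development stage include Realtime Database, Authentication, Hosting, Storage, Cloud Messaging, and Remote Config, alongside Test Lab for Android and crash reporting for managing app quality.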
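[Figure: features currently offered by the Firebase platform]
ML Kit pricing: the advanced features are paid.
Functionally, ML Kit falls into three parts: vision, natural language processing, and custom models.
ML Kit's runtime stack is built on Mobile Vision, Cloud Vision, and TensorFlow Lite. At the lowest level it runs on the Neural Networks API (NNAPI) on Android and on Metal on iOS.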
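3. History of ML Kit
3.1 Mobile Vision
Mobile Vision ships as part of Google Mobile Services (GMS). First released in 2015, it has since been superseded by the more capable ML Kit.
The Mobile Vision API has four components: the Common Utility API and the three task-specific APIs described below. The common API provides the infrastructure and building blocks for constructing an application pipeline.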
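1. Barcode API
- Supports both 1D barcodes and 2D codes
- Supports many barcode formats
- Use cases: tracking and recognizing any barcode or QR code
2. Face API
- Captures faces at different angles and with very exaggerated expressions
- Produces facial landmarks for tasks such as positioning
- Classifies facial expressions
- Use cases: generating fun user avatars, recognizing products and recommending ways to buy them, and so on
3. Text API
- Supports more than 20 Latin-script languages
- Analyzes text at the paragraph, sentence, and word level
- Use cases: extracting credit card details, extracting business card details, real-time translation, and so on
SDK package names: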
Common functionality: com.google.android.gms.vision
Face detector: com.google.android.gms.vision.face
Barcode detector: com.google.android.gms.vision.barcode
Text detector: com.google.android.gms.vision.text
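Google Mobile Vision (GMV) supports both iOS and Android. To build a complete image processing pipeline, you only need to set up three classes according to the API and your business requirements: a Detector, which captures the image content; a Processor, which lets you handle either a single detected item or several at once (Focus mode and Multi mode); and finally a Tracker, which you implement yourself to carry out your business logic on the detected information. On both iOS and Android, following this flow is enough to build a GMV app.
In the complete flow, the camera source uses the Camera API internally and passes image frames to the detector, which runs its algorithm to produce detection results. The results are then handed to the processor. The processor is the first post-processing step: it filters, merges, or forwards the detected items to the associated Tracker.
In short, there are two steps:
- Set up the tracking pipeline.
- Implement a Tracker that follows changes over time in real time.
The Camera Source, Detector, and Processor in this flow are all provided by the Mobile Vision API; all you have to do is write the code for your own business logic, that is, implement the Tracker.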
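The flow described above can be modeled in plain Java. This is only a minimal sketch of the Detector -> Processor -> Tracker hand-off: the `Detector`, `Tracker`, and `MultiProcessor` types below are hypothetical stand-ins written for this example, not the real Mobile Vision classes, and the "frames" are simple strings rather than camera images.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

/**
 * Plain-Java model of the Detector -> Processor -> Tracker flow.
 * All types here are hypothetical stand-ins, not the real Mobile Vision classes.
 */
final class PipelineSketch {

    /** Hypothetical detector: maps one frame to detections keyed by item id. */
    interface Detector<T> {
        Map<Integer, T> detect(String frame);
    }

    /** Hypothetical tracker: the part that carries the app's business logic. */
    interface Tracker<T> {
        void onNewItem(int id, T item);
        void onUpdate(T item);
        void onDone();
    }

    /** Hypothetical "multi" processor: routes each detected item to its own tracker. */
    static final class MultiProcessor<T> {
        private final Supplier<Tracker<T>> trackerFactory;
        private final Map<Integer, Tracker<T>> active = new TreeMap<>();

        MultiProcessor(Supplier<Tracker<T>> trackerFactory) {
            this.trackerFactory = trackerFactory;
        }

        void receive(Map<Integer, T> detections) {
            // Retire trackers whose item no longer appears in this frame.
            active.entrySet().removeIf(entry -> {
                if (detections.containsKey(entry.getKey())) {
                    return false;
                }
                entry.getValue().onDone();
                return true;
            });
            // Create a tracker for each new item and update the existing ones.
            for (Map.Entry<Integer, T> d : detections.entrySet()) {
                Tracker<T> tracker = active.get(d.getKey());
                if (tracker == null) {
                    tracker = trackerFactory.get();
                    active.put(d.getKey(), tracker);
                    tracker.onNewItem(d.getKey(), d.getValue());
                } else {
                    tracker.onUpdate(d.getValue());
                }
            }
        }
    }

    /** Feeds frames through the toy pipeline and returns the tracker callbacks in order. */
    static List<String> run(List<String> frames) {
        List<String> log = new ArrayList<>();
        // Toy detector: a frame is a comma-separated list of "id=label" tokens.
        Detector<String> detector = frame -> {
            Map<Integer, String> found = new TreeMap<>();
            for (String token : frame.split(",")) {
                if (token.isEmpty()) {
                    continue;
                }
                String[] parts = token.split("=");
                found.put(Integer.parseInt(parts[0]), parts[1]);
            }
            return found;
        };
        // The tracker is the only piece an app author would normally write.
        MultiProcessor<String> processor = new MultiProcessor<String>(() -> new Tracker<String>() {
            private int trackedId;

            @Override
            public void onNewItem(int id, String item) {
                trackedId = id;
                log.add("new " + id + ":" + item);
            }

            @Override
            public void onUpdate(String item) {
                log.add("update " + trackedId + ":" + item);
            }

            @Override
            public void onDone() {
                log.add("done " + trackedId);
            }
        });
        for (String frame : frames) {
            processor.receive(detector.detect(frame));
        }
        return log;
    }

    public static void main(String[] args) {
        run(List.of("1=smile", "1=neutral,2=smile", "2=smile")).forEach(System.out::println);
    }
}
```

In this model only the anonymous `Tracker` contains application logic; the detector and processor play the role of the parts the SDK supplies.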
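Changes from GMS Vision to Firebase ML Kit
Item | Mobile Vision | ML Kit 2018 | ML Kit 2019 |
---|---|---|---|
Status | Being deprecated | First release | Beta |
Features | Barcode scanning, face detection, text recognition | Barcode scanning, face detection, text recognition, image labeling, landmark recognition | Barcode scanning, face detection, text recognition, image labeling, object detection and tracking, landmark recognition, language identification, translation, Smart Reply, AutoML model inference, custom model inference |
Cloud support | No | No | Yes |
Pricing | Free | Free | Advanced features consume cloud resources and are paid |
Number of apps | Many | Very few | Growing |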
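4. Analysis of ML Kit
4.1 Files
Like Google's other SDKs, the ML Kit packages are hosted in Google’s Maven repository; to use them, add them as dependencies in the Android project's build.gradle file.
Module | SDK artifact |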
---|---|
ML Kit: Vision APIs | com.google.firebase:firebase-ml-vision:24.0.1 |
ML Kit: Image Labeling Model | com.google.firebase:firebase-ml-vision-image-label-model:19.0.0 |
ML Kit: Object Detection and Tracking Model | com.google.firebase:firebase-ml-vision-object-detection-model:19.0.3 |
ML Kit: Face Detection Model | com.google.firebase:firebase-ml-vision-face-model:19.0.0 |
ML Kit: Barcode Scanning Model | com.google.firebase:firebase-ml-vision-barcode-model:16.0.2 |
ML Kit: AutoML Vision Edge API | com.google.firebase:firebase-ml-vision-automl:18.0.3 |
ML Kit: Natural Language APIs | com.google.firebase:firebase-ml-natural-language:22.0.0 |
ML Kit: Language Identification Model | com.google.firebase:firebase-ml-natural-language-language-id-model:20.0.7 |
ML Kit: Translate Model | com.google.firebase:firebase-ml-natural-language-translate-model:20.0.7 |
ML Kit: Smart Reply Model | com.google.firebase:firebase-ml-natural-language-smart-reply-model:20.0.7 |
ML Kit: Custom Model APIs | com.google.firebase:firebase-ml-model-interpreter:22.0.1 |
4.2 API
4.2.1 com.google.firebase.ml.common
Annotations
FirebaseMLException.Code | The set of Firebase ML status codes. |
---|---|
Exceptions
FirebaseMLException | A class of exceptions thrown by Firebase Machine Learning |
---|---|
4.2.1.1 com.google.firebase.ml.common.modeldownload
Classes
FirebaseLocalModel | Describes a local model created from local or asset files. |
---|---|
FirebaseLocalModel.Builder | Builder class of FirebaseLocalModel. |
FirebaseModelDownloadConditions | Conditions to download remote models. |
FirebaseModelDownloadConditions.Builder | Builder of FirebaseModelDownloadConditions. |
FirebaseModelManager | Manages the registration of remote and local models. |
FirebaseRemoteModel | Describes a remote model to be downloaded to the device. |
FirebaseRemoteModel.Builder | Builder of FirebaseRemoteModel. |
4.2.2 com.google.firebase.ml.custom
Annotations
FirebaseModelDataType.DataType | Supported data types for FirebaseModelInputs. |
---|---|
Classes
FirebaseCustomLocalModel | Describes a local model created from local or asset files. |
---|---|
FirebaseCustomLocalModel.Builder | Builder class of FirebaseCustomLocalModel. |
FirebaseCustomRemoteModel | Describes a remote model to be downloaded to the device. |
FirebaseCustomRemoteModel.Builder | Builder of FirebaseCustomRemoteModel. |
FirebaseModelDataType | Data types supported by FirebaseModelInputs. |
FirebaseModelInputOutputOptions | Configurations for data types and dimensions of input and output data. |
FirebaseModelInputOutputOptions.Builder | Builder class to build FirebaseModelInputOutputOptions. |
FirebaseModelInputs | Input data for FirebaseModelInterpreter. |
FirebaseModelInputs.Builder | Builder class of FirebaseModelInputs. |
FirebaseModelInterpreter | Interpreter to run custom models with TensorFlow Lite (requires API level 16+). A model interpreter is created via getInstance(FirebaseModelInterpreterOptions). To run inference, specify a FirebaseCustomRemoteModel or FirebaseCustomLocalModel, create a FirebaseModelInterpreterOptions, and then create a FirebaseModelInterpreter. |
FirebaseModelInterpreterOptions | Immutable options to configure the model interpreter FirebaseModelInterpreter. |
FirebaseModelInterpreterOptions.Builder | Builder class of FirebaseModelInterpreterOptions. |
FirebaseModelOutputs | Stores inference results. |
4.2.3 com.google.firebase.ml.naturallanguage
Classes
FirebaseNaturalLanguage | Entry class for Firebase machine learning natural language services. |
---|---|
4.2.3.1 com.google.firebase.ml.naturallanguage.languageid
Classes
FirebaseLanguageIdentification | Entry point for Language Identification. |
---|---|
FirebaseLanguageIdentificationOptions | Options for FirebaseLanguageIdentification |
FirebaseLanguageIdentificationOptions.Builder | Builder to create a FirebaseLanguageIdentificationOptions instance. |
IdentifiedLanguage | A language identified by identifyPossibleLanguages(String). |
4.2.3.2 com.google.firebase.ml.naturallanguage.smartreply
Annotations
SmartReplySuggestionResult.Status | All possible status codes for a Smart Reply suggestion attempt. |
---|---|
Classes
FirebaseSmartReply | Entry class for Firebase Smart Reply, which automatically suggests meaningful replies to a user input message. |
---|---|
FirebaseTextMessage | Represents a text message from a certain user in a conversation, providing context for SmartReply to generate reply suggestions. |
SmartReplySuggestion | A suggested reply to a given text. |
SmartReplySuggestionResult | The suggested result from the FirebaseSmartReply for the given text. |
4.2.3.3 com.google.firebase.ml.naturallanguage.translate
Annotations
FirebaseTranslateLanguage.TranslateLanguage | A language supported by the Translate API. |
---|---|
Classes
FirebaseTranslateLanguage | Information about the languages that are supported by the Translate API. |
---|---|
FirebaseTranslateRemoteModel | Information about a downloaded or to-be-downloaded model for translation. |
FirebaseTranslateRemoteModel.Builder | Builder for a FirebaseTranslateRemoteModel. |
FirebaseTranslator | Entry point for Translation. |
FirebaseTranslatorOptions | Options for FirebaseTranslator |
FirebaseTranslatorOptions.Builder | Builder to create a FirebaseTranslatorOptions instance. |
4.2.4 com.google.firebase.ml.vision
Classes
FirebaseVision | Entry class for Firebase machine learning vision services. |
---|---|
4.2.4.1 com.google.firebase.ml.vision.common
Annotations
FirebaseVisionImageMetadata.ImageFormat | Accepted image format of vision APIs. |
---|---|
FirebaseVisionImageMetadata.Rotation | Indicates the image rotation. |
Classes
FirebaseVisionImage | Represents an image object that can be used for both on-device and cloud API detectors. |
---|---|
FirebaseVisionImageMetadata | Image metadata used by FirebaseVision detectors. |
FirebaseVisionImageMetadata.Builder | Builder class of FirebaseVisionImageMetadata. |
FirebaseVisionLatLng | An object representing a latitude/longitude pair. |
FirebaseVisionPoint | Represents a 2D or 3D point for FirebaseVision. |
4.2.4.2 com.google.firebase.ml.vision.automl
Classes
FirebaseAutoMLLocalModel | Describes a local model created from local or asset files. |
---|---|
FirebaseAutoMLLocalModel.Builder | Builder class of FirebaseAutoMLLocalModel. |
FirebaseAutoMLRemoteModel | Describes a remote model to be downloaded to the device. |
FirebaseAutoMLRemoteModel.Builder | Builder of FirebaseAutoMLRemoteModel. |
4.2.4.3 com.google.firebase.ml.vision.barcode
Annotations
FirebaseVisionBarcode.Address.AddressType | Address type constants. |
---|---|
FirebaseVisionBarcode.BarcodeFormat | Barcode format constants - enumeration of supported barcode formats. |
FirebaseVisionBarcode.BarcodeValueType | Barcode value type constants - enumeration of supported barcode content value types. Supported types include: TYPE_UNKNOWN, TYPE_CONTACT_INFO, TYPE_EMAIL, TYPE_ISBN, TYPE_PHONE, TYPE_PRODUCT, TYPE_SMS, TYPE_TEXT, TYPE_URL, TYPE_WIFI, TYPE_GEO, TYPE_CALENDAR_EVENT, TYPE_DRIVER_LICENSE. |
FirebaseVisionBarcode.Email.FormatType | Email format type constants. |
FirebaseVisionBarcode.Phone.FormatType | Phone number format type constants. |
FirebaseVisionBarcode.WiFi.EncryptionType | Wifi encryption type constants. |
Classes
FirebaseVisionBarcode | Represents a single recognized barcode and its value. |
---|---|
FirebaseVisionBarcode.Address | An address. |
FirebaseVisionBarcode.CalendarDateTime | DateTime data type used in calendar events. |
FirebaseVisionBarcode.CalendarEvent | A calendar event extracted from QRCode. |
FirebaseVisionBarcode.ContactInfo | A person’s or organization’s business card. |
FirebaseVisionBarcode.DriverLicense | A driver license or ID card. |
FirebaseVisionBarcode.Email | An email message from a ‘MAILTO:’ or similar QRCode type. |
FirebaseVisionBarcode.GeoPoint | GPS coordinates from a ‘GEO:’ or similar QRCode type. |
FirebaseVisionBarcode.PersonName | A person’s name, both formatted version and individual name components. |
FirebaseVisionBarcode.Phone | Phone number info. |
FirebaseVisionBarcode.Sms | A sms message from a ‘SMS:’ or similar QRCode type. |
FirebaseVisionBarcode.UrlBookmark | A URL and title from a ‘MEBKM:’ or similar QRCode type. |
FirebaseVisionBarcode.WiFi | A wifi network parameters from a ‘WIFI:’ or similar QRCode type. |
FirebaseVisionBarcodeDetector | Recognizes barcodes (in a variety of 1D and 2D formats) in a supplied FirebaseVisionImage. |
FirebaseVisionBarcodeDetectorOptions | Options for FirebaseVisionBarcodeDetector. |
FirebaseVisionBarcodeDetectorOptions.Builder | Builder to build out a FirebaseVisionBarcodeDetectorOptions. |
4.2.4.4 com.google.firebase.ml.vision.cloud
Annotations
FirebaseVisionCloudDetectorOptions.ModelType | Model types for cloud vision APIs: STABLE_MODEL and LATEST_MODEL. |
---|---|
Classes
FirebaseVisionCloudDetectorOptions | Options for all cloud vision detectors. |
---|---|
FirebaseVisionCloudDetectorOptions.Builder | Builder of FirebaseVisionCloudDetectorOptions. |
4.2.4.4.1 com.google.firebase.ml.vision.cloud.landmark
Classes
FirebaseVisionCloudLandmark | Represents a landmark detected by FirebaseVisionCloudLandmarkDetector. |
---|---|
FirebaseVisionCloudLandmarkDetector | Detector for finding popular natural and man-made structures within an image. |
4.2.4.5 com.google.firebase.ml.vision.document
Annotations
FirebaseVisionDocumentText.RecognizedBreak.BreakType | Detected start or end of a structural component type: UNKNOWN, SPACE, SURE_SPACE, EOL_SURE_SPACE, HYPHEN, LINE_BREAK. |
---|---|
Classes
FirebaseVisionCloudDocumentRecognizerOptions | Represents the cloud document recognizer options. |
---|---|
FirebaseVisionCloudDocumentRecognizerOptions.Builder | Builder of FirebaseVisionCloudDocumentRecognizerOptions. |
FirebaseVisionDocumentText | Represents text detected by FirebaseVisionDocumentTextRecognizer. |
FirebaseVisionDocumentText.Block | A logical element on the page. |
FirebaseVisionDocumentText.Paragraph | A structural unit of text representing a number of words in certain order. |
FirebaseVisionDocumentText.RecognizedBreak | Detected start or end of a structural component. |
FirebaseVisionDocumentText.Symbol | A single symbol representation. |
FirebaseVisionDocumentText.Word | A single word representation. |
FirebaseVisionDocumentTextRecognizer | Detector for performing optical character recognition (OCR) on an input image by sending the image to the Google Cloud ML backend. |
4.2.4.6 com.google.firebase.ml.vision.face
Annotations
FirebaseVisionFaceContour.ContourType | Contour types for face. |
---|---|
FirebaseVisionFaceDetectorOptions.ClassificationMode | Indicates whether to run additional classifiers for characterizing attributes such as “smiling” and “eyes open”. |
FirebaseVisionFaceDetectorOptions.ContourMode | Sets whether to detect contours or not. |
FirebaseVisionFaceDetectorOptions.LandmarkMode | Sets whether to detect no landmarks or all landmarks. |
FirebaseVisionFaceDetectorOptions.PerformanceMode | Extended option for controlling additional accuracy / speed trade-offs in performing face detection. |
FirebaseVisionFaceLandmark.LandmarkType | Landmark types for face. |
Classes
FirebaseVisionFace | Represents a face detected by FirebaseVisionFaceDetector. |
---|---|
FirebaseVisionFaceContour | Represents a face contour. |
FirebaseVisionFaceDetector | Detector for finding FirebaseVisionFace objects in a supplied image. |
FirebaseVisionFaceDetectorOptions | Options for FirebaseVisionFaceDetector. |
FirebaseVisionFaceDetectorOptions.Builder | Builder class of FirebaseVisionFaceDetectorOptions. |
FirebaseVisionFaceLandmark | Represents a face landmark. |
4.2.4.7 com.google.firebase.ml.vision.label
Annotations
FirebaseVisionImageLabeler.ImageLabelerType | Image Labeler types. |
---|---|
Classes
FirebaseVisionCloudImageLabelerOptions | Options for cloud image labeler. |
---|---|
FirebaseVisionCloudImageLabelerOptions.Builder | Builder of FirebaseVisionCloudImageLabelerOptions. |
FirebaseVisionImageLabel | Represents an image label detected by FirebaseVisionImageLabeler. |
FirebaseVisionImageLabeler | Used for finding FirebaseVisionImageLabel objects in a supplied image. |
FirebaseVisionOnDeviceAutoMLImageLabelerOptions | Options for the on-device AutoML image labeler. |
FirebaseVisionOnDeviceAutoMLImageLabelerOptions.Builder | Builder of FirebaseVisionOnDeviceAutoMLImageLabelerOptions. |
FirebaseVisionOnDeviceImageLabelerOptions | Options for the on-device image labeler. |
FirebaseVisionOnDeviceImageLabelerOptions.Builder | Builder of FirebaseVisionOnDeviceImageLabelerOptions. |
4.2.4.8 com.google.firebase.ml.vision.objects
Annotations
FirebaseVisionObject.Category | Classification category of detected objects. |
---|---|
FirebaseVisionObjectDetectorOptions.DetectorMode | The detector mode which indicates whether detection is for single image or for streaming. |
Classes
FirebaseVisionObject | Represents an object detected by FirebaseVisionObjectDetector. |
---|---|
FirebaseVisionObjectDetector | Detector for finding FirebaseVisionObject instances in a supplied image. |
FirebaseVisionObjectDetectorOptions | Options for FirebaseVisionObjectDetector. |
FirebaseVisionObjectDetectorOptions.Builder | Builder of FirebaseVisionObjectDetectorOptions. |
4.2.4.9 com.google.firebase.ml.vision.text
Annotations
FirebaseVisionCloudTextRecognizerOptions.CloudTextModelType | Cloud model types for text recognition. |
---|---|
FirebaseVisionTextRecognizer.RecognizerType | Recognizer types. |
Classes
FirebaseVisionCloudTextRecognizerOptions | Represents the cloud text recognizer options. |
---|---|
FirebaseVisionCloudTextRecognizerOptions.Builder | Builder of FirebaseVisionCloudTextRecognizerOptions. |
FirebaseVisionText | A hierarchical representation of texts. |
FirebaseVisionText.Element | Roughly equivalent to a space-separated “word” in most Latin languages, or a character in others. |
FirebaseVisionText.Line | Represents a line of text. |
FirebaseVisionText.TextBlock | A block of text (think of it as a paragraph) as deemed by the OCR engine. |
FirebaseVisionTextRecognizer | Text recognizer for performing optical character recognition (OCR) on an input image. |
RecognizedLanguage | Recognized language for a structural component. |
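The table describes FirebaseVisionText as a hierarchy of text blocks, lines, and elements. The plain-Java sketch below models that nesting so the traversal order is easy to see. The `TextBlock`, `Line`, and `Element` classes here are hypothetical stand-ins written for this example; they only borrow the names from the table and do not reproduce the SDK's actual classes or method signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Plain-Java model of a block -> line -> element text hierarchy.
 * These classes are hypothetical stand-ins, not the Firebase SDK types.
 */
final class TextHierarchySketch {

    /** Roughly a single word. */
    static final class Element {
        final String text;

        Element(String text) {
            this.text = text;
        }
    }

    /** A line of text made of elements. */
    static final class Line {
        final List<Element> elements = new ArrayList<>();

        Line(String... words) {
            for (String word : words) {
                elements.add(new Element(word));
            }
        }

        /** Joins the elements of this line with single spaces. */
        String getText() {
            StringBuilder sb = new StringBuilder();
            for (Element element : elements) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(element.text);
            }
            return sb.toString();
        }
    }

    /** A paragraph-like block made of lines. */
    static final class TextBlock {
        final List<Line> lines;

        TextBlock(Line... lines) {
            this.lines = Arrays.asList(lines);
        }

        /** Joins the lines of this block with newlines. */
        String getText() {
            StringBuilder sb = new StringBuilder();
            for (Line line : lines) {
                if (sb.length() > 0) {
                    sb.append('\n');
                }
                sb.append(line.getText());
            }
            return sb.toString();
        }
    }

    /** Walks the hierarchy top-down and collects every element's text in reading order. */
    static List<String> words(List<TextBlock> blocks) {
        List<String> out = new ArrayList<>();
        for (TextBlock block : blocks) {
            for (Line line : block.lines) {
                for (Element element : line.elements) {
                    out.add(element.text);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<TextBlock> blocks = Arrays.asList(
                new TextBlock(new Line("Hello", "ML"), new Line("Kit")),
                new TextBlock(new Line("2018")));
        System.out.println(words(blocks));
        System.out.println(blocks.get(0).getText());
    }
}
```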
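4.3 Hierarchy
Judging by the package names, ML Kit is divided into modules by feature.
In terms of class relationships, it is simply a collection of classes composed together; their name prefixes mark the feature they belong to, and they all derive directly from Object.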
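Many of the classes listed in section 4.2 come in "Options" plus "Options.Builder" pairs that configure a detector. The sketch below illustrates that composition style in plain Java. `DetectorOptions`, its `Builder`, and `Detector` are hypothetical names invented for this example, and the setters and mode strings are assumptions rather than real SDK members.

```java
import java.util.Objects;

/**
 * Hypothetical illustration of an "Options + Builder + Detector" composition.
 * None of these classes exist in the Firebase SDK.
 */
final class OptionsBuilderSketch {

    /** Immutable configuration object, built only through its Builder. */
    static final class DetectorOptions {
        final String performanceMode;
        final boolean landmarksEnabled;

        private DetectorOptions(Builder builder) {
            this.performanceMode = builder.performanceMode;
            this.landmarksEnabled = builder.landmarksEnabled;
        }

        /** Collects settings step by step, starting from defaults. */
        static final class Builder {
            private String performanceMode = "FAST";
            private boolean landmarksEnabled = false;

            Builder setPerformanceMode(String mode) {
                this.performanceMode = Objects.requireNonNull(mode);
                return this;
            }

            Builder setLandmarksEnabled(boolean enabled) {
                this.landmarksEnabled = enabled;
                return this;
            }

            DetectorOptions build() {
                return new DetectorOptions(this);
            }
        }
    }

    /** A detector holds its options by composition; it extends nothing but Object. */
    static final class Detector {
        private final DetectorOptions options;

        Detector(DetectorOptions options) {
            this.options = Objects.requireNonNull(options);
        }

        String describe() {
            return "mode=" + options.performanceMode + ", landmarks=" + options.landmarksEnabled;
        }
    }

    public static void main(String[] args) {
        DetectorOptions options = new DetectorOptions.Builder()
                .setPerformanceMode("ACCURATE")
                .setLandmarksEnabled(true)
                .build();
        System.out.println(new Detector(options).describe());
    }
}
```

Keeping the options immutable and building them separately means a detector's configuration cannot change after it is created, which is one plausible reason for this style.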
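4.4 Usage
First create a project in the Firebase console, then download its google-services.json configuration file, add that file to your local project, and add the dependencies to build.gradle. You are then ready to develop.
4.5 Version history
ML Kit is still in beta. Since its release in May 2018 it has changed a great deal, with a new version roughly every one to two months.
A partial list of updates follows; they are mainly bug fixes and new features. For the full history, see https://firebase.google.com/support/release-notes/android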
- November 22, 2019
- October 16, 2019
- September 26, 2019
- ...
- August 16, 2018: ML Kit version 17.0.0
- June 12, 2018: ML Kit version 16.0.0
- May 8, 2018: ML Kit first released
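5. Summary
• Public API: Java SDK packages.
• Openness: shipped as a Google Android SDK; only the interfaces are public, and the implementation is closed source.
• Models: provided by the system or defined by the user.
• Integration: the app bundles the SDK and calls its APIs.
• Deployment: in the cloud or on the mobile device.
• Upgrades: each SDK module is updated independently, per use case.
• Evolution: still in beta; future work is expected to focus on bug fixes and new use cases.
Author: carter2005
Last updated: 2020-03-06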