Applies to: Azure Information Protection, Windows Server 2019, Windows Server 2016, Windows Server 2012 R2
Note
This article is for the current general availability version of the Azure Information Protection scanner with the Azure Information Protection client (classic), and the preview version of the scanner for the current preview version of the Azure Information Protection unified labeling client.
If you have previously installed the scanner and want to upgrade it, use the following upgrade instructions and then use the instructions on this page, omitting the step to install the scanner:
If you have a version of the scanner that is older than 1.48.204.0 and you're not ready to upgrade it, see Deploying previous versions of the Azure Information Protection scanner to automatically classify and protect files.
Use this information to learn about the Azure Information Protection scanner, and then how to successfully install, configure, and run it.
This scanner runs as a service on Windows Server and lets you discover, classify, and protect files on the following data stores:
To scan and label files on cloud repositories, use Cloud App Security instead of the scanner.
Overview of the Azure Information Protection scanner
When you have configured labels that apply automatic classification, files that this scanner discovers can then be labeled. Labels apply classification, and optionally, apply protection or remove protection:
The scanner can inspect any files that Windows can index, by using IFilters that are installed on the computer. Then, to determine if the files need labeling, the scanner uses the Office 365 built-in data loss prevention (DLP) sensitivity information types and pattern detection, or Office 365 regex patterns. Because the scanner uses the Azure Information Protection client (the classic client or unified labeling client), the scanner can classify and protect the same file types:
You can run the scanner in discovery mode only, where you use the reports to check what would happen if the files were labeled. Or, you can run the scanner to automatically apply the labels. You can also run the scanner to discover files that contain sensitive information types, without configuring labels for conditions that apply automatic classification.
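To illustrate the general idea of detecting a sensitive information type, the following Python sketch pairs a regular expression with a checksum to find candidate credit card numbers in text. It is a simplified stand-in, not the Office 365 DLP definitions or the scanner's own implementation; `CARD_PATTERN`, `luhn_valid`, and `find_card_numbers` are names invented for this example.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return pattern matches in `text` that also pass the checksum."""
    return [m.group(0) for m in CARD_PATTERN.finditer(text) if luhn_valid(m.group(0))]
```

Combining a pattern with a validation step like this reduces false positives compared with matching digits alone, which is why the sketch rejects digit runs that fail the checksum.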
Note that the scanner does not discover and label in real time. It systematically crawls through files on data stores that you specify, and you can configure this cycle to run once, or repeatedly.
You can specify which file types to scan, or exclude from scanning, by defining a file types list as part of the scanner configuration.
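The effect of a file types list can be sketched as follows. This sketch assumes that an include list, when present, takes precedence over an exclude list, which the text above does not state; `should_scan` and its parameters are hypothetical and are not scanner settings.

```python
from pathlib import PureWindowsPath
from typing import Iterable, Optional


def should_scan(path: str,
                include: Optional[Iterable[str]] = None,
                exclude: Optional[Iterable[str]] = None) -> bool:
    """Decide whether a file is scanned, based on its file name extension."""
    ext = PureWindowsPath(path).suffix.lower()
    if include is not None:
        # Assumed behavior: only listed extensions are scanned.
        return ext in {e.lower() for e in include}
    if exclude is not None:
        # Assumed behavior: everything except listed extensions is scanned.
        return ext not in {e.lower() for e in exclude}
    return True
```

For example, `should_scan(r"\\server\share\setup.exe", exclude=[".exe"])` evaluates to `False`, while a call with no lists scans every file.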
Prerequisites for the Azure Information Protection scanner
Before you install the Azure Information Protection scanner, make sure that the following requirements are in place.
If you can't meet all the requirements in the table because they are prohibited by your organization policies, see the next section for alternatives.
If all the requirements are met, go straight to configuring the scanner section.
Deploying the scanner with alternative configurations
The prerequisites listed in the table are the default requirements for the scanner and are recommended because they are the simplest configuration for the scanner deployment. They should be suitable for initial testing, so that you can check the capabilities of the scanner. However, in a production environment, your organization policies might prohibit these default requirements because of one or more of the following restrictions:
The scanner can accommodate these restrictions but they require additional configuration.
Restriction: The scanner server cannot have Internet connectivity
Supported by the classic client only: Follow the instructions for a disconnected computer. Then, do the following:
Note that in this configuration, the scanner cannot apply protection (or remove protection) by using your organization's cloud-based key. Instead, the scanner is limited to using labels that apply classification only, or protection that uses HYOK.
Restriction: You cannot be granted Sysadmin or databases must be created and configured manually
If you can be granted the Sysadmin role temporarily to install the scanner, you can remove this role when the scanner installation is complete. When you use this configuration, the database is automatically created for you and the service account for the scanner is automatically granted the required permissions. However, the user account that configures the scanner requires the db_owner role for the scanner configuration database, and you must manually grant this role to the user account.
If you cannot be granted the Sysadmin role even temporarily, you must ask a user with Sysadmin rights to manually create a database before you install the scanner. For this configuration, the following roles must be assigned:
Typically, you will use the same user account to install and configure the scanner. But if you use different accounts, they both require the db_owner role for the scanner configuration database:
To create a user and grant db_owner rights on this database, ask the Sysadmin to run the following SQL script twice: the first time for the service account that runs the scanner, and the second time for the account that you use to install and manage the scanner. Before running the script:
SQL script:
Additionally:
If, after configuring these permissions, you see an error when you install the scanner, the error can be ignored and you can manually start the scanner service.
Restriction: The service account for the scanner cannot be granted the Log on locally right
If your organization policies prohibit the Log on locally right for service accounts but allow the Log on as a batch job right, use the following instructions:
Restriction: The scanner service account cannot be synchronized to Azure Active Directory but the server has Internet connectivity
You can have one account to run the scanner service and use another account to authenticate to Azure Active Directory:
Configure the scanner in the Azure portal
Before you install the scanner, or upgrade it from an older general availability version of the scanner, create a profile for the scanner in the Azure portal. You configure the profile for scanner settings, and the data repositories to scan.
You're now ready to install the scanner with the scanner profile that you've just created.
Install the scanner
Now that you have installed the scanner, you need to get an Azure AD token for the scanner service account to authenticate, so that the scanner can run unattended.
Get an Azure AD token for the scanner
The Azure AD token lets the scanner service account authenticate to the Azure Information Protection service.
The scanner now has a token to authenticate to Azure AD. The token is valid for one year or two years, or it never expires, according to your configuration of the Web app / API (classic client) or client secret (unified labeling client) in Azure AD. When the token expires, you must repeat steps 1 and 2.
You're now ready to run your first scan in discovery mode.
Run a discovery cycle and view reports for the scanner
The Azure portal displays information about the last scan only. If you need to see the results of previous scans, return to the reports that are stored on the scanner computer, in the %localappdata%\Microsoft\MSIP\Scanner\Reports folder.
When you're ready to automatically label the files that the scanner discovers, continue to the next procedure.
Configure the scanner to apply classification and protection
If you are following these instructions, the scanner runs one time and in the reporting-only mode. To change these settings, edit the scanner profile:
Because we configured the schedule to run continuously, when the scanner has worked its way through all the files, it automatically starts a new cycle so that any new and changed files are discovered.
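The difference between a one-time cycle and a continuous schedule can be sketched in Python as follows. The schedule values `Manual` and `Always` are the ones named in the scanner profile; `run_scanner`, `scan_cycle`, and the `max_cycles` bound (added only so that the example terminates) are hypothetical.

```python
from typing import Callable


def run_scanner(scan_cycle: Callable[[], object],
                schedule: str = "Manual",
                max_cycles: int = 3) -> int:
    """Run scan cycles according to the schedule and return how many ran."""
    if schedule not in ("Manual", "Always"):
        raise ValueError(f"Unknown schedule: {schedule}")
    cycles = 0
    while True:
        scan_cycle()  # crawl the configured data repositories once
        cycles += 1
        # "Manual" stops after one cycle; "Always" starts a new cycle
        # immediately (bounded here by max_cycles for illustration).
        if schedule == "Manual" or cycles >= max_cycles:
            return cycles
```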
How files are scanned
The scanner runs through the following processes when it scans files.
1. Determine whether files are included or excluded for scanning
The scanner automatically skips files that are excluded from classification and protection, such as executable files and system files. For more information, see the following admin guides:
You can change this behavior by defining a list of file types to scan, or exclude from scanning. You can specify this list for the scanner to apply to all data repositories by default, and you can specify a list for each data repository. To specify this list, use the File types to scan setting in the scanner profile:
2. Inspect and label files
The scanner then uses filters to scan supported file types. These same filters are used by the operating system for Windows Search and indexing. Without any additional configuration, Windows IFilter is used to scan file types that are used by Word, Excel, PowerPoint, and for PDF documents and text files.
For a full list of file types that are supported by default, and additional information about how to configure existing filters that include .zip files and .tiff files, see the following admin guides:
After inspection, these file types can be labeled by using the conditions that you specified for your labels. Or, if you're using discovery mode, these files can be reported to contain the conditions that you specified for your labels, or all known sensitive information types.
However, the scanner cannot label the files under the following circumstances:
For example, after inspecting files that have a file name extension of .txt, the scanner can't apply a label that's configured for classification but not protection, because the .txt file type doesn't support classification-only. If the label is configured for classification and protection, and the registry is edited for the .txt file type, the scanner can label the file.
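The .txt example can be expressed as a small decision function. This is a sketch of the reasoning only, assuming that the outcome depends on whether the label applies protection, whether the file type supports classification-only, and whether protection is enabled for the file type (for example, by editing the registry); `can_label` and its parameters are hypothetical.

```python
def can_label(label_applies_protection: bool,
              supports_classification_only: bool,
              protection_enabled: bool) -> bool:
    """Return True if a label of the given kind can be applied to a file type."""
    if label_applies_protection:
        # Classification and protection: the scanner must be able to
        # protect this file type.
        return protection_enabled
    # Classification only: the file type itself must support it.
    return supports_classification_only


# A .txt file with a classification-only label cannot be labeled.
txt_classification_only = can_label(False, supports_classification_only=False,
                                    protection_enabled=False)
# A .txt file with a protection label, after the registry is edited, can be labeled.
txt_with_protection = can_label(True, supports_classification_only=False,
                                protection_enabled=True)
```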
Tip
During this process, if the scanner stops and doesn't complete scanning a large number of the files in a repository:
3. Label files that can't be inspected
For the file types that can't be inspected, the scanner applies the default label in the Azure Information Protection policy, or the default label that you configure for the scanner.
As in the preceding step, the scanner cannot label the files under the following circumstances:
Editing the registry for the scanner
This section applies to the scanner from the classic client only.
To change the default scanner behavior for protecting file types other than Office files and PDFs, you must manually edit the registry and specify the additional file types that you want to be protected, and the type of protection (native or generic). For instructions, see File API configuration from the developer guidance. In this documentation for developers, generic protection is referred to as 'PFile'. In addition, specific to the scanner:
When you edit the registry, manually create the MSIPC key and FileProtection key if they do not exist, as well as a key for each file name extension.
For example, for the scanner to protect TIFF images in addition to Office files and PDFs, add a key for the TIFF file name extension. Because TIFF is an image file type, it supports native protection, and the resulting file name extension is .ptiff.
For a list of text and images file types that similarly support native protection but must be specified in the registry, see Supported file types for classification and protection.
For files that don't support native protection, specify the file name extension as a new key, and PFile for generic protection. The resulting file name extension for the protected file is .pfile.
When files are rescanned
For the first scan cycle, the scanner inspects all files in the configured data stores. For subsequent scans, it inspects only new or modified files.
You can force the scanner to inspect all files again from the Azure Information Protection - Profiles blade in the Azure portal. Select your scanner profile from the list, and then select the Rescan all files option:
Inspecting all files again is useful when you want the reports to include all files. This configuration choice is typically used when the scanner runs in discovery mode. When a full scan is complete, the scan type automatically changes to incremental so that for subsequent scans, only new or modified files are scanned.
In addition, all files are inspected when the scanner from the classic client downloads an Azure Information Protection policy that has new or changed conditions, or when the scanner from the unified labeling client has new or changed settings for automatic and recommended labeling.
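The rescan rules in this section can be summarized in a short Python sketch. It assumes that new or modified files are identified by comparing a last-modified time with the time of the previous scan, which is an assumption for illustration and not a description of how the scanner tracks changes; `FileInfo` and `files_to_inspect` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class FileInfo:
    path: str
    modified: float  # last-modified time, in seconds since the epoch


def files_to_inspect(files: Iterable[FileInfo],
                     last_scan_time: Optional[float],
                     rescan_all: bool = False,
                     policy_changed: bool = False) -> List[FileInfo]:
    """Select the files for the next scan cycle."""
    # Full scan: the first cycle, a forced rescan, or changed labeling conditions.
    if last_scan_time is None or rescan_all or policy_changed:
        return list(files)
    # Incremental scan: only files changed since the previous scan.
    return [f for f in files if f.modified > last_scan_time]
```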
The scanner refreshes the policy according to the following triggers:
Tip
If you need to refresh the policy sooner than the default interval, for example, during a testing period:
Then restart the Azure Information Protection Scanner service. If you changed protection settings for your labels, also wait 15 minutes from when you saved the protection settings before you restart the service.
Editing in bulk for the data repository settings
For the data repositories that you've added to a scanner profile, you can use the Export and Import options to quickly make changes to the settings. For example, for your SharePoint data repositories, you want to add a new file type to exclude from scanning.
Instead of editing each data repository in the Azure portal, use the Export option from the Repositories blade:
Manually edit the file to make the change, and then use the Import option on the same blade.
Using the scanner with alternative configurations
There are three alternative scenarios that the Azure Information Protection scanner supports where labels do not need to be configured for any conditions:
Optimizing the performance of the scanner
Use the following guidance to help you optimize the performance of the scanner. However, if your priority is the responsiveness of the scanner computer rather than the scanner performance, you can use an advanced client setting to limit the number of threads used by the scanner (classic client only).
To maximize the scanner performance:
If necessary, install multiple instances of the scanner. The Azure Information Protection scanner supports multiple configuration databases on the same SQL server instance when you specify a custom profile name for the scanner. For the scanner from the unified labeling client, multiple scanners can share the same profile, which results in quicker scanning times.
Other factors that affect the scanner performance:
List of cmdlets for the scanner
Because you now configure the scanner from the Azure portal, cmdlets from previous versions that configured data repositories and the scanned file types list are now deprecated.
The cmdlets that remain include cmdlets that install and upgrade the scanner, change the scanner configuration database and profile, change the local reporting level, and import configuration settings for a disconnected computer.
The full list of cmdlets for the scanner:
Event log IDs and descriptions for the scanner
Use the following sections to identify the possible event IDs and descriptions for the scanner. These events are logged on the server that runs the scanner service, in the Windows Applications and Services event log, Azure Information Protection.
Information 910
Scanner cycle started.
This event is logged when the scanner service is started and begins to scan for files in the data repositories that you specified.
Information 911
Scanner cycle finished.
This event is logged when the scanner has finished a manual scan, or the scanner has finished a cycle for a continuous schedule.
If the scanner was configured to run manually rather than continuously, to scan the files again, set the Schedule to Manual or Always in the scanner profile, and then restart the service.
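If you process these events in a script, a lookup table is one simple way to hold them. The following Python sketch covers only the two events listed above; `SCANNER_EVENTS` and `describe_event` are hypothetical names, and the fallback text for other IDs is an assumption of this example.

```python
# Event ID -> (level, description), taken from the two events listed above.
SCANNER_EVENTS = {
    910: ("Information", "Scanner cycle started."),
    911: ("Information", "Scanner cycle finished."),
}


def describe_event(event_id: int) -> str:
    """Return a one-line summary for a scanner event ID."""
    level, text = SCANNER_EVENTS.get(
        event_id, ("Unknown", "Not one of the events listed here.")
    )
    return f"{level} {event_id}: {text}"
```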
Next steps
Interested in how the Core Services Engineering and Operations team in Microsoft implemented this scanner? Read the technical case study: Automating data protection with Azure Information Protection scanner.
You might be wondering: What’s the difference between Windows Server FCI and the Azure Information Protection scanner?
You can also use PowerShell to interactively classify and protect files from your desktop computer. For more information about this and other scenarios that use PowerShell, see the following sections from the admin guides: